refactor: per-CPU runqueue data structure (SMP Phase 2)

Replace global rq_active/rq_expired with per-CPU runqueue array:
- struct cpu_rq: active/expired runqueue pair + idle process per CPU
- pcpu_rq[SCHED_MAX_CPUS] array replaces global runqueue pointers
- All enqueue/dequeue operations now index by process cpu_id field
- schedule() uses percpu_cpu_index() to select local CPU's runqueue
- process_init() initializes all CPU runqueues, sets pcpu_rq[0].idle
- Add cpu_id field to struct process (set to 0 for now)
- rq_pick_next() takes cpu parameter, swaps per-CPU active/expired
- All wake paths (kill, signal, sleep wake, exit_notify) enqueue
  to the target process's assigned CPU runqueue
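
Illustrative sketch of the layout described above (not the code in
this commit). Only cpu_rq, pcpu_rq, SCHED_MAX_CPUS, cpu_id and
rq_pick_next() are names taken from this message; struct runqueue,
rq_next, rq_push(), rq_pop(), rq_init_all(), rq_enqueue(),
rq_expire() and the value of SCHED_MAX_CPUS are hypothetical and
exist only to keep the example self-contained:

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_MAX_CPUS 4            /* assumed value, for illustration */

struct process {
    int pid;
    unsigned cpu_id;                /* CPU whose runqueue owns this process */
    struct process *rq_next;        /* hypothetical intrusive queue link */
};

/* Hypothetical runqueue: a plain FIFO list. */
struct runqueue {
    struct process *head, *tail;
};

/* Per-CPU state: active/expired pair plus the CPU's idle process. */
struct cpu_rq {
    struct runqueue queues[2];
    struct runqueue *active;
    struct runqueue *expired;
    struct process *idle;
};

struct cpu_rq pcpu_rq[SCHED_MAX_CPUS];

void rq_push(struct runqueue *q, struct process *p)
{
    p->rq_next = NULL;
    if (q->tail)
        q->tail->rq_next = p;
    else
        q->head = p;
    q->tail = p;
}

struct process *rq_pop(struct runqueue *q)
{
    struct process *p = q->head;
    if (p) {
        q->head = p->rq_next;
        if (!q->head)
            q->tail = NULL;
        p->rq_next = NULL;
    }
    return p;
}

/* Reset every CPU's runqueue pair; idle is set separately per CPU. */
void rq_init_all(void)
{
    for (unsigned c = 0; c < SCHED_MAX_CPUS; c++) {
        struct cpu_rq *rq = &pcpu_rq[c];
        rq->queues[0].head = rq->queues[0].tail = NULL;
        rq->queues[1].head = rq->queues[1].tail = NULL;
        rq->active = &rq->queues[0];
        rq->expired = &rq->queues[1];
        rq->idle = NULL;
    }
}

/* Wake paths: enqueue on the runqueue of the process's assigned CPU. */
void rq_enqueue(struct process *p)
{
    assert(p->cpu_id < SCHED_MAX_CPUS);
    rq_push(pcpu_rq[p->cpu_id].active, p);
}

/* Timeslice used up: park the process on its CPU's expired queue. */
void rq_expire(struct process *p)
{
    assert(p->cpu_id < SCHED_MAX_CPUS);
    rq_push(pcpu_rq[p->cpu_id].expired, p);
}

/* Pick the next process for `cpu`, swapping active/expired when the
 * active queue is empty and falling back to that CPU's idle process. */
struct process *rq_pick_next(unsigned cpu)
{
    struct cpu_rq *rq = &pcpu_rq[cpu];
    if (!rq->active->head) {
        struct runqueue *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
    }
    struct process *p = rq_pop(rq->active);
    return p ? p : rq->idle;
}
```

In the real scheduler, schedule() would pass percpu_cpu_index() as
the cpu argument; the sketch also omits locking and priority levels.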
All processes are still assigned to CPU 0; Phase 3/4 will activate
AP scheduling and load balancing.