Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to. getsizeof() calls the object’s __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
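
To make the "not the objects it refers to" point concrete, here is a minimal sketch that compares the shallow size reported by `sys.getsizeof` with a one-level-deep total. The helper name `shallow_and_deep_size` is made up for this illustration, it only handles a flat list, and the exact byte counts depend on the CPython version and platform.

```python
import sys

def shallow_and_deep_size(items):
    """Return (shallow, deep) sizes in bytes for a flat list of objects."""
    shallow = sys.getsizeof(items)  # the list object itself: header plus pointer array
    # Count each distinct element once, since a list may hold the same object several times
    unique = {id(x): x for x in items}
    deep = shallow + sum(sys.getsizeof(x) for x in unique.values())
    return shallow, deep

numbers = list(range(1000, 2000))
shallow, deep = shallow_and_deep_size(numbers)
print(shallow, deep)
```

The `total_size` recipe used in the script below does the same thing recursively for nested containers.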

import os
import time
import gc
import signal
from mem import total_size  # see https://code.activestate.com/recipes/577504

# disable GC
gc.set_threshold(0)

long_list = list(range(1000000))
print('the list uses {} MB of memory'.format(total_size(long_list) / (1024 * 1024)))

def sleep_forever():
    while True:
        time.sleep(1)

pid = os.fork()  # noqa
if pid == 0:
    # child
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    time.sleep(16)
    for i in long_list:
        print(i)
    # gc.collect()
    sleep_forever()
else:
    # parent
    print('parent pid {}'.format(os.getpid()))
    print('child pid {}'.format(pid))
    try:
        sleep_forever()
    except KeyboardInterrupt:
        os.wait()


Copy-on-write finds its main use in sharing the virtual memory of operating system processes, in the implementation of the fork system call. Typically, the process does not modify any memory and immediately executes a new process, replacing the address space entirely. Thus, it would be wasteful to copy all of the process's memory during a fork, and instead the copy-on-write technique is used.

It can be implemented efficiently using the page table by marking certain pages of memory as read-only and keeping a count of the number of references to the page. When data is written to these pages, the kernel intercepts the write attempt and allocates a new physical page, initialized with the copy-on-write data, although the allocation can be skipped if there is only one reference. The kernel then updates the page table with the new (writable) page, decrements the number of references, and performs the write. The new allocation ensures that a change in the memory of one process is not visible in another's.

The copy-on-write technique can be extended to support efficient memory allocation by having a page of physical memory filled with zeros. When the memory is allocated, all the pages returned refer to the page of zeros and are all marked copy-on-write. This way, physical memory is not allocated for the process until data is written, allowing processes to reserve more virtual memory than physical memory and use memory sparsely, at the risk of running out of virtual address space.
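
The visible effect of this scheme, that a write in one process never shows up in the other, can be checked with a few lines. This is only a sketch for POSIX systems with `os.fork`; the helper name `write_in_child` is made up here, and the code demonstrates the isolation guarantee only, not whether the kernel actually deferred the copy.

```python
import os

def write_in_child(buf):
    """Fork, let the child overwrite buf[0], and return (parent_view, child_view)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands on the child's own copy of the page
        os.close(read_fd)
        buf[0] = 0xFF
        os.write(write_fd, bytes(buf[:1]))
        os._exit(0)
    # Parent: collect what the child saw, then look at our own buffer
    os.close(write_fd)
    child_view = os.read(read_fd, 1)[0]
    os.close(read_fd)
    os.waitpid(pid, 0)
    return buf[0], child_view

parent_view, child_view = write_in_child(bytearray(4096))
print(parent_view, child_view)  # 0 255
```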

So what happens when one of the processes attempts to write to the virtual page at 0x8000? Suppose process A writes to the page first. Because the PTE allows only read accesses, this write triggers a page fault. The page fault handler goes through the steps described in the previous section and first locates the matching vm-area. It then checks whether the vm-area permits write accesses. Because the vm-area in process A still has the access rights set to RW, the write access is permitted. The page fault handler then checks whether the PTE exists in the page table and whether the PTE has the present bit on. Because the page is resident, both of these checks pass. In the last step, the page fault handler checks whether it is dealing with a write access to a page whose PTE does not permit write accesses. Because this is the case, the handler detects that it is time to copy a copy-on-write page. It proceeds by checking the page frame descriptor of page frame 100 to see how many processes are currently using this page. Because process B is also using this page frame, the count is 2 and the page fault handler decides that it must copy the page frame. It does this by first allocating a free page frame, say, page frame 131, copying the original frame to this new frame, and then updating the PTE in process A to point to page frame 131. Because process A now has a private copy of the page, the access permission in the PTE can be set to RW again. The page fault handler then returns, and at this point the write access can complete without any further errors. Figure 4.35 (c) illustrates the state as it exists at this point.

Note that the PTE in process B still has write permission turned off, even though it is now the sole user of page frame 100. This remains so until the process attempts a write access. When that happens, the page fault handler is invoked again and the same steps are repeated as for process A. However, when checking the page frame descriptor of page frame 100, it finds that there are no other users and goes ahead to turn on write permission in the PTE without first making a copy of the page.
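
The two cases above (copy when the frame is still shared, just flip the permission bit when it is not) can be sketched as a toy model. Everything here is hypothetical and simplified: the `Memory` and `Process` classes are invented for illustration, the vm-area check is left out, and the new frame gets number 101 rather than the 131 used in the text.

```python
class Memory:
    """Toy model of physical page frames with per-frame reference counts."""

    def __init__(self):
        self.frames = {}      # frame number -> bytearray contents
        self.refcount = {}    # frame number -> number of PTEs pointing at it
        self.next_frame = 100

    def alloc(self, data):
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = bytearray(data)  # bytearray() copies the data
        self.refcount[frame] = 1
        return frame


class Process:
    def __init__(self, mem):
        self.mem = mem
        self.page_table = {}  # virtual page -> [frame, writable]

    def fork(self):
        child = Process(self.mem)
        for vpage, pte in self.page_table.items():
            pte[1] = False                         # parent loses write permission
            child.page_table[vpage] = [pte[0], False]
            self.mem.refcount[pte[0]] += 1
        return child

    def write(self, vpage, offset, value):
        pte = self.page_table[vpage]
        if not pte[1]:                             # "page fault" on a read-only PTE
            frame = pte[0]
            if self.mem.refcount[frame] > 1:       # still shared: copy the frame
                self.mem.refcount[frame] -= 1
                pte[0] = self.mem.alloc(self.mem.frames[frame])
            pte[1] = True                          # private now: re-enable writes
        self.mem.frames[pte[0]][offset] = value


mem = Memory()
a = Process(mem)
a.page_table[0x8000] = [mem.alloc(b"\x00" * 16), True]
b = a.fork()
a.write(0x8000, 0, 1)   # A faults and gets a private copy
b.write(0x8000, 0, 2)   # B faults and keeps frame 100 without copying
print(a.page_table[0x8000], b.page_table[0x8000])
```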

If the page is loaded in memory at the time the fault is generated, but is not marked in the memory management unit as being loaded in memory, then it is called a minor or soft page fault. The page fault handler in the operating system merely needs to make the entry for that page in the memory management unit point to the page in memory and indicate that the page is loaded in memory; it does not need to read the page into memory. This could happen if the memory is shared by different programs and the page is already brought into memory for other programs.
The page could also have been removed from the working set of a process, but not yet written to disk or erased, such as in operating systems that use Secondary Page Caching. For example, HP OpenVMS may remove a page that does not need to be written to disk (if it has remained unchanged since it was last read from disk, for example) and place it on a Free Page List if the working set is deemed too large. However, the page contents are not overwritten until the page is assigned elsewhere, meaning it is still available if it is referenced by the original process before being allocated. Since these faults do not involve disk latency, they are faster and less expensive than major page faults.
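
A process can read its own minor-fault counter through the standard `resource` module (Unix only). The sketch below touches freshly mapped anonymous pages and reports how much `ru_minflt` grew; the helper name is made up, and the exact number depends on the kernel (for example, transparent huge pages can lower it), so no expected output is shown.

```python
import mmap
import resource

def minor_faults_while_touching(num_pages):
    """Touch num_pages fresh anonymous pages; return how many minor faults that cost."""
    page = mmap.PAGESIZE
    region = mmap.mmap(-1, num_pages * page)   # anonymous mapping, not backed by a file
    before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    for i in range(num_pages):
        region[i * page] = 1                   # first write to each page
    after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    region.close()
    return after - before

delta = minor_faults_while_touching(1024)
print(delta)
```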

The ps command exposes two counters, min_flt and maj_flt. In my run here, a total of 8036 minor faults (min_flt) occurred.

[before loop]
  PID User     Command            Swap      USS      PSS      RSS
22296 root     python cow.py         0      296    54372   109588
22162 root     python cow.py         0      296    54768   113076

[after loop]
  PID User     Command            Swap      USS      PSS      RSS
22296 root     python cow.py         0    32064    70365   110320
22162 root     python cow.py         0    32064    70627   113076

(22162 is the parent; 22296 is the child.)
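
As a rough consistency check on these figures, the fault count and the USS growth should describe about the same amount of memory. The sketch below assumes 4 KiB pages and assumes the size columns are in KiB (which matches the ps_mem output further down); neither is stated in the original output.

```python
PAGE_KIB = 4                          # assumption: 4 KiB pages
min_flt = 8036                        # minor faults reported in the text
uss_before, uss_after = 296, 32064    # child USS from the table, assumed KiB

fault_mib = min_flt * PAGE_KIB / 1024
uss_growth_mib = (uss_after - uss_before) / 1024
print(round(fault_mib, 1), round(uss_growth_mib, 1))  # 31.4 31.0
```

The two numbers land close together, which is what you would expect if nearly every minor fault copied one page into private memory.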


P.S. A brief note on what USS, PSS, and RSS mean:

• USS (Unique Set Size): the amount of unshared memory unique to the process
• PSS (Proportional Set Size): USS + (shared memory / number of processes sharing that memory)
• RSS (Resident Set Size): USS + shared memory

All three metrics count only the part that is actually resident in memory; anything in swap is excluded. See also Stack Overflow: What is RSS and VSZ in Linux memory management.
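
The three definitions can be written out as a small calculation. This is a sketch with a made-up helper, `memory_metrics`, fed with the child's after-loop numbers under the simplifying assumption that all of its shared memory is shared by exactly two processes.

```python
def memory_metrics(private_kib, shared_mappings):
    """shared_mappings: list of (size_kib, number_of_processes_sharing_it)."""
    uss = private_kib
    pss = uss + sum(size / n for size, n in shared_mappings)
    rss = uss + sum(size for size, _ in shared_mappings)
    return uss, pss, rss

# 110320 (RSS) - 32064 (USS) = 78256 of shared memory, assumed split between 2 processes
uss, pss, rss = memory_metrics(32064, [(78256, 2)])
print(uss, pss, rss)  # 32064 71192.0 110320
```

The computed PSS (71192) is a little higher than the measured 70365. One plausible reason is that some shared pages, such as system libraries, are mapped by more than two processes.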

➜  ps_mem -p 22296
Private  +   Shared  =  RAM used       Program

31.3 MiB +  37.4 MiB =  68.7 MiB       python
---------------------------------
68.7 MiB
=================================


# Binding the loop variable i increments, then decrements, the reference count of every element in long_list
for i in long_list:
    print(i)
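
The reference-count change can be observed directly with `sys.getrefcount`. This is a CPython-specific sketch with a made-up helper name; it compares readings taken the same way before, during, and after the loop, because the absolute values differ between interpreter versions.

```python
import sys

def refcount_delta_during_loop(items):
    """Return (rise while the loop variable is bound to items[0], change after the loop)."""
    before = sys.getrefcount(items[0])
    during = before
    for i in items:
        during = sys.getrefcount(items[0])  # i is currently bound to items[0]
        break
    del i                                   # drop the loop variable's reference
    after = sys.getrefcount(items[0])
    return during - before, after - before

delta = refcount_delta_during_loop([object() for _ in range(3)])
print(delta)  # (1, 0)
```

In CPython the count is stored inside the object itself, so this "read-only" loop still writes to the memory holding each element, and after a fork each such write can trigger a copy of the page it sits on.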


### Reference

Python vs Copy on Write