When a page table entry(PTE) is not marked as valid, it means the data needed is not in memory, but on the disk. So now page fault happens and the OS is responsible to load this page of data from the disk to memory.
My question is, how does the OS know the exact disk address?
You are asking in a system dependent manner. A PTE not marked as valid may mean the address does not exist at all in the process address. A system may have another bit to indicate that the address is valid but logical to physical mapping does not exist.
The operating system needs to maintain a table of where it put the data.
The data can exist in a number of places. 1. It might be uninitialized data that has no mapping anywhere. Respond to the page fault by clearing a physical page and mapping it to the process address space.
It might be in the page file.
Some systems have a separate swap file.
It might be in the executable or shared library file.