Day6 of "Learning and sharing about Linux"
Inodes & Metadata: How the kernel identifies files without names
To the Linux kernel, a filename is just a convenient alias for humans. To the system, a file is an inode.
An inode (index node) is a data structure that stores everything about a file except its name and its actual data.
It contains:
- File type: (regular file, directory, symbolic link, etc...)
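- permissions: read, write, execute (r, w, x) for the owner, the group, and everyone else
- owner info: user ID (UID) and group ID (GID)
- size
- timestamps: accessed (atime), modified (mtime), and changed (ctime, the last metadata change, not the creation time). Some filesystems such as ext4 also record a creation time
- pointers: the addresses of the disk blocks where the data actually lives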
So how does the kernel find the data?
- Dentry lookup: the kernel looks up the name in the directory. A directory is really just a special file that maps filenames to inode numbers (each mapping is a directory entry, or dentry)
- Inode retrieval: once the kernel has the inode number, it looks up that entry in the filesystem's inode table
- Data access: the inode tells the kernel which disk blocks hold the data, for example blocks 500, 502 and 505
This is why moving a file within the same partition is near-instantaneous: the data blocks don't move, only the directory entry pointing to the inode changes.
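A quick way to check this yourself is to compare the inode number before and after a move. This is a minimal sketch, assuming GNU coreutils (`stat -c %i` prints the inode number); the scratch directory and file names are made up for illustration.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
mkdir "$workdir/a" "$workdir/b"
echo "hello" > "$workdir/a/note.txt"

# Inode number before the move (GNU stat: %i is the inode number).
before=$(stat -c %i "$workdir/a/note.txt")

# Move and rename within the same filesystem: only directory entries change.
mv "$workdir/a/note.txt" "$workdir/b/renamed.txt"

# Inode number after the move.
after=$(stat -c %i "$workdir/b/renamed.txt")
echo "before=$before after=$after"
```

Both numbers should match, because the name changed but the inode did not. Moving across partitions is different: the data has to be copied, and the file gets a new inode on the target filesystem.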
some extra info:
- every file needs an inode, and on filesystems like ext4 the number of inodes is fixed when the filesystem is created, so if you have millions of 1KB files you can run out of inodes before you run out of storage
- a hard link is just another filename pointing to the same inode. The file is deleted only when the link count drops to zero (and no process still has it open)
- when you delete a file, the kernel doesn't actually erase the data in those blocks. It just marks the inode as free and the data blocks as available, so the stale data stays on disk until something new overwrites it
`df -i` shows you how many inodes are left on each mounted filesystem
Here's an example of what inode metadata actually looks like
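The hard link behaviour can be sketched in a few commands. This assumes GNU coreutils (`stat -c %h` prints the link count, `%i` the inode number); the file names are hypothetical.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
echo "data" > "$workdir/original.txt"

# Create a second name (hard link) for the same inode.
ln "$workdir/original.txt" "$workdir/alias.txt"

inode_a=$(stat -c %i "$workdir/original.txt")
inode_b=$(stat -c %i "$workdir/alias.txt")
links=$(stat -c %h "$workdir/original.txt")    # link count is now 2

# Removing one name only decrements the link count.
rm "$workdir/original.txt"
content=$(cat "$workdir/alias.txt")             # data is still reachable
links_after=$(stat -c %h "$workdir/alias.txt")  # link count is back to 1

echo "inodes: $inode_a $inode_b, links: $links -> $links_after, content: $content"
```

The data only becomes reclaimable once the last name is removed too.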
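A small usage sketch, assuming GNU `df` (the `--output` option is a GNU coreutils feature and may not exist in other implementations):

```shell
# Inode usage for the filesystem that holds the current directory.
df -i .

# Only the number of free inodes (GNU df: --output selects columns).
free_inodes=$(df --output=iavail . | tail -n 1 | tr -d ' ')
echo "free inodes: $free_inodes"
```

Compare the `IUse%` column of `df -i` with the `Use%` column of plain `df` to see which limit you are closer to.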
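One way to look at this metadata on your own machine is `stat`. The sketch below assumes GNU coreutils (the `-c` format option and its `%i`, `%F`, `%A`, `%h`, `%s` placeholders); the file name is made up.

```shell
# Hypothetical scratch file, for illustration only.
workdir=$(mktemp -d)
echo "hello inode" > "$workdir/example.txt"

# Full metadata dump: size, blocks, inode number, link count,
# permissions, owner, and timestamps.
stat "$workdir/example.txt"

# Pick out individual inode fields with a format string (GNU stat).
summary=$(stat -c 'inode=%i type=%F perms=%A links=%h size=%s' "$workdir/example.txt")
echo "$summary"
```

Notice that the filename only shows up because you passed it in: it comes from the directory entry, not from the inode.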
Day5 of "Learning and sharing about Linux"
In Linux, everything is a file. It's a fundamental design principle that makes the operating system so powerful.
Whether you're interacting with a text document, a hard drive, a keyboard, or even a network socket, the OS treats them all as a stream of bytes.
Without this principle, engineers would have had to design 50 different APIs for 50 different things: keyboard, file, network, USB, etc.
There are three standard streams, each identified by a file descriptor number:
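To make that concrete, here is a minimal sketch where the same tool reads from a regular file and from a kernel device. `/dev/urandom` and `/dev/null` are standard Linux device files; the scratch file name is hypothetical.

```shell
# Hypothetical scratch file, for illustration only.
workdir=$(mktemp -d)
printf 'abcdefgh' > "$workdir/plain.txt"

# head reads 4 bytes from a regular file...
from_file=$(head -c 4 "$workdir/plain.txt" | wc -c)

# ...and 4 bytes from a device that produces random bytes, with the same call.
from_device=$(head -c 4 /dev/urandom | wc -c)

# Writing to a device uses the same redirection as writing to a file.
echo "discard me" > /dev/null

echo "file bytes: $from_file, device bytes: $from_device"
```

`head` has no idea which one is a "real" file. It just opens a path and reads bytes.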
1. stdin 0
2. stdout 1
3. stderr 2
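A small sketch of writing to file descriptors 1 and 2 separately. The `report` function is a made-up helper for illustration.

```shell
# A hypothetical function that writes one line to each output stream.
report() {
  echo "normal output"         # goes to fd 1 (stdout)
  echo "something failed" >&2  # goes to fd 2 (stderr)
}

# Capture stdout only; stderr is sent to /dev/null.
only_out=$(report 2>/dev/null)

# Capture stderr only: point fd 2 at the capture first, then discard fd 1.
only_err=$(report 2>&1 >/dev/null)

echo "stdout said: $only_out"
echo "stderr said: $only_err"
```

The two streams look identical on a terminal, but they are separate file descriptors, so they can be routed independently.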
Redirection: changing the flow of these streams
With redirection you can send output to, or read input from, a location of your choice instead of the default (the terminal).
1. output redirection:
- `ls > files.txt` overwrites files.txt with the list of files
- `ls >> files.txt` appends the list to the end without deleting the existing content.
2. error redirection:
- `grep "search" important.go 2> errors.log` redirects only stderr to errors.log
- `command > output.log 2>&1` the first part redirects stdout to output.log, the second part redirects stderr (2) to wherever stdout (1) is currently going, so both stdout and stderr end up in output.log
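The operators above can be tried out in a scratch directory. This is a minimal sketch; all file names are hypothetical, and `ls` on a missing path is used only as a convenient way to produce an error message.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)

echo "first"  > "$workdir/out.txt"    # > creates or truncates, then writes
echo "second" > "$workdir/out.txt"    # overwrites: only "second" remains
echo "third" >> "$workdir/out.txt"    # >> appends: "second" then "third"

# Only stderr goes to errors.log. ls exits non-zero here, so
# "|| true" keeps a strict (set -e) shell from stopping.
ls "$workdir/missing" 2> "$workdir/errors.log" || true

# stdout goes to both.log, then stderr is pointed at the same place.
ls "$workdir/out.txt" "$workdir/missing" > "$workdir/both.log" 2>&1 || true

cat "$workdir/out.txt"
```

Order matters for the last form: `2>&1 > both.log` would point stderr at the terminal (where stdout was going at that moment) and only then move stdout to the file.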
Piping:
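I wrote about this yesterday, here's a brief recap
`ls | grep "main.go"`
bash creates a pipe, then spawns a child process for each command (ls and grep)
ls writes its list to stdout, which is connected to the write end of the pipe
grep reads from stdin, which is connected to the read end of the pipe
both processes run at the same time, and the kernel buffers the data between them
the parent shell waits for both children to exit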
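Here is a small sketch of the same pipeline in a scratch directory, plus a second pipeline that only works because both sides run at the same time. The file names are hypothetical.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
touch "$workdir/main.go" "$workdir/util.go" "$workdir/README.md"

# stdout of ls is connected to stdin of grep through the pipe.
matches=$(ls "$workdir" | grep "main.go")
echo "$matches"

# "yes" never stops on its own. The pipeline still finishes, because head
# reads while yes is running and exits after 3 lines, which ends yes too.
line_count=$(yes | head -n 3 | wc -l)
echo "lines read: $line_count"
```

If the shell really ran the left side to completion before starting the right side, the `yes | head` pipeline would never finish.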
I'm quoting a more detailed explanation about pipes here 👇