Monday, October 19, 2020

Fundamental Concept of Operating Systems

 

An operating system is simply an interface for the user (whether a human being or a program) to interact with the hardware of a computer. The major functions of an OS are storage management, memory management, protection and security, and process management. Protection is responsible for limiting access to certain files, while security is responsible for guarding the system from attack, whether internal or external. Process management is responsible for managing the CPU and the processes being executed, suspended, synchronized, or communicated with. Memory management keeps track of which parts of memory are in use and who is using them. It also decides who gets to use memory and allocates it as necessary. Storage management is broken into a few nodes. File system management takes charge of the file system, creating and deleting files and directories to organize them, as well as mapping files onto secondary storage. Mass storage management takes charge of free space, storage allocation, and disk scheduling. Caching is the means by which the system uses memory to optimize performance. Basically, these few systems make up the base for a functioning OS.

Underlying the OS are some system services that are pivotal to its operation. These services provide an environment for programmers and users to access the functionality of the parts of the OS. I have outlined these in the concept map below.



A program is a static object that is waiting to be executed. Think of it as code that is written, but it is just words. You need a compiler (and then the OS to load the result) to make it a process. Better yet, a cookbook is only paper, but when you work one of the recipes it becomes food. So, cooking is a process and the cookbook is the program.

A process is an active entity and has 5 distinct states: new, ready, running, waiting, and terminated. It was funny to me to spend so much time on each of these states. Why? Because they seem overtly obvious. The naming of the states could not have been more literal. It would have been like naming Florida "Sunny," or Nebraska "Flat." New is changing from program to process (getting started), ready is having resources assigned and waiting for the CPU, running is, well… running, waiting is waiting for some type of I/O to happen (press yes or no), and terminated is "we're done here."
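The five states above can be sketched as a tiny state machine. This is a minimal toy model, not how any real kernel stores states; the transition table is my own simplification of the classic diagram (e.g., a waiting process goes back to ready, not straight to running).

```python
# Toy model of the five process states and which transitions are legal.
LEGAL = {
    "new": {"ready"},
    "ready": {"running"},
    "running": {"ready", "waiting", "terminated"},
    "waiting": {"ready"},
    "terminated": set(),
}

class Process:
    def __init__(self):
        self.state = "new"          # every process starts out new

    def move_to(self, state):
        if state not in LEGAL[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {state}")
        self.state = state

# Walk one process through a typical lifetime.
p = Process()
for s in ["ready", "running", "waiting", "ready", "running", "terminated"]:
    p.move_to(s)
print(p.state)  # terminated
```

Trying a transition the table forbids (say, ready straight to terminated) raises an error, which mirrors the idea that a process cannot skip states.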

The process control block (PCB) is responsible for coordinating the movement of a process through these states. It is additionally responsible for coordinating the resources during each state. It records the process state, the program counter, the CPU registers, CPU-scheduling information, accounting information (a funny way of saying all the "numbers" stuff, like time used or the process number), I/O status information (info on the physical resources allocated to the process), and memory-management information. So, it does a lot.
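The bookkeeping a PCB carries can be sketched as a plain record. The field names below are illustrative, not taken from any real kernel; this is just the list from the paragraph above turned into a data structure.

```python
from dataclasses import dataclass, field

@dataclass
class PCB:
    pid: int                                    # process number (accounting)
    state: str = "new"                          # process state
    program_counter: int = 0                    # next instruction address
    registers: dict = field(default_factory=dict)   # saved CPU registers
    priority: int = 0                           # CPU-scheduling info
    open_files: list = field(default_factory=list)  # I/O status info
    cpu_time_used: float = 0.0                  # accounting info
    memory_base: int = 0                        # memory-management info
    memory_limit: int = 0

pcb = PCB(pid=42)
pcb.state = "ready"
pcb.program_counter += 4   # advance past one (imaginary) 4-byte instruction
print(pcb.pid, pcb.state, pcb.program_counter)
```

When the OS switches the CPU to another process, it saves the registers and program counter into a structure like this and restores them later, which is why the PCB "does a lot."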

Threads are such a huge part of the modern computer that it is strange how quickly they can be defined, yet how much time the text spent talking about them. Perhaps I missed a lot of what it was saying, but I got the ideas. There are two types of threading (if that's the right word): single and multi. You would think that multithreading is the best, and you'd be right for most computing that requires human interaction. A single-threaded process may be ideal in some embedded applications, but for the most part multithreading is the way we go. Think of it like this: you need to tell your entire company about an upcoming event, and you choose to send an email about it. With a single thread you send an individual email to each person in the company, one at a time. With multiple threads, each message goes out through its own worker, all at once. I really feel like I oversimplified this, so here are the parts involved: the code, data, files, registers, stack, and the thread itself. A single-threaded process has one of each; in a multithreaded process the threads share the code, data, and open files, but each thread gets its own registers and stack. So, the work of a multithreaded process is spread out across several threads, allowing parallel processing == increased performance and happier users.
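The email analogy can be sketched with Python's `threading` module: one thread per recipient, all sharing the same `sent` list (shared data), each with its own stack. The function and addresses are invented for illustration.

```python
import threading

sent = []                     # data shared by every thread
lock = threading.Lock()       # guard the shared list while appending

def send_email(recipient):
    # Pretend this does real I/O; here we just record the recipient.
    with lock:
        sent.append(recipient)

recipients = [f"user{i}@example.com" for i in range(5)]
threads = [threading.Thread(target=send_email, args=(r,)) for r in recipients]
for t in threads:
    t.start()                 # all five "emails" go out concurrently
for t in threads:
    t.join()                  # wait for every worker to finish

print(len(sent))  # 5
```

Note that the threads share `sent` (the data) but each runs `send_email` on its own stack, matching the share/private split described above.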

The critical-section problem sounds like we've just gotten to the part of brain surgery that is most tricky and dangerous and "we've got a problem." In fact, it arises when two processes try to execute their critical sections, the parts that touch shared data, at the same time. It's so easy to see, but hard to describe. I guess it would be similar to listening to a book on your phone, and you're just getting to the climax when your phone rings. The call is one you have been waiting for, but you really don't want to stop reading at the moment. So, you can either pause the book, or silence the call. But what to do? You've reached the critical-section problem. When this happens in a process, one solution offered by the text was to assign a start and end marker to each critical section. Thus, when one critical section is executing and another tries to start, it cannot until the end marker is passed by the processor. This could cause delay but would prevent critical data from being corrupted and a crash occurring.
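The "start and end marker" idea is essentially what a lock gives you: acquiring the lock is the start marker, releasing it is the end marker, and no other thread can enter the section in between. A minimal sketch with Python threads and an invented shared counter:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:            # start marker: enter the critical section
            counter += 1      # shared data touched by every thread
                              # end marker: lock released on block exit

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 — no update was lost
```

Without the lock, two threads can read the same old value of `counter` and one update gets lost; the lock serializes the critical section at the cost of a little waiting, exactly the trade-off described above.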



Memory management is key to the operation of a computer. Because a computer relies on the movement of processes through different channels of memory, a management unit is critical. Relative speed and protection are the major players in memory management. We must keep user data, multiple users' data, and OS data separate. While we are keeping things safe and separate, we need to keep speed and performance in scope as well. This is accomplished through hardware and address binding.

One way to keep things safe is by providing a separate memory space for each process. To keep the spaces separate, two registers (base and limit) are assigned fixed, legal addresses that define a range. If an instruction's address is outside this range, it is deemed illegal (it does not belong to the process in execution). This results in a trap, or fatal error.
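The base/limit check boils down to one comparison: an address is legal only if `base <= address < base + limit`. A minimal sketch, using made-up register values:

```python
BASE, LIMIT = 300040, 120900   # one process's region (illustrative values)

def check(address):
    """Return the address if it is legal for this process, else trap."""
    if BASE <= address < BASE + LIMIT:
        return address
    raise MemoryError("trap: illegal address")   # addressing error -> trap

check(300040)          # fine: the first legal address (the base itself)
check(420939)          # fine: the last legal address (base + limit - 1)
# check(420940)        # would trap: one byte past the end of the region
```

The real check happens in hardware on every memory access, which is why it costs nothing at run time; the OS only gets involved when the trap fires.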

The other way is through address binding within the memory management unit (MMU). The instruction's address location is bound to the data associated with it. The most common binding method is execution-time (run-time) binding, where the binding cannot occur until run time. The addresses the CPU generates for a program are virtual addresses. When the MMU assigns the physical location, this becomes the physical address. The set of all virtual addresses for a program is the virtual address space, and the set of all physical addresses corresponding to these virtual addresses is the physical address space (Silberschatz et al., 2014). Within this scheme the base register becomes the relocation register, and its value is added to every address generated by a user process at the time it is sent to memory (Silberschatz et al., 2014). So, when a virtual address is passed to the MMU, the relocation register turns it into a physical address within the physical address space, and the process's data is accessed at that location.
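The relocation step is just an addition with a bounds check in front of it. A sketch with invented numbers (relocation register 14000, a 5000-byte virtual address space):

```python
RELOCATION = 14000     # value of the relocation (base) register
LIMIT = 5000           # size of this process's virtual address space

def mmu(virtual_address):
    """Translate a virtual address to a physical one, as the MMU would."""
    if not 0 <= virtual_address < LIMIT:
        raise MemoryError("trap: address outside virtual address space")
    return virtual_address + RELOCATION   # dynamic relocation

print(mmu(346))   # 14346
print(mmu(0))     # 14000 — virtual address 0 maps to the base itself
```

The user program only ever sees addresses 0 to 4999; it never knows (or needs to know) the physical location 14000, which is what lets the OS move the process in memory without recompiling it.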

Memory management is designed to keep processes from getting mixed up with one another or with the operating system. It uses hardware and address binding to keep things straight within the confines of the logical and physical address spaces. It is also responsible for managing the speed of instruction movement from storage to memory to CPU cache.




A file system is the lifeblood of a computer. Without a file system you have a fancy set of wires, silicon, plastic, and copper parts. It would accept a charge and that's about it. The file system is in control of the computer. The file system is responsible for housing all of the files that make the computer function. It is also responsible for managing those files and keeping up with where they are located within storage for easy retrieval. There are six basic functions of a file system: creating a file, writing a file, reading a file, repositioning within a file, deleting a file, and truncating a file. If there were no file system, the computer would not function.
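All six operations can be exercised with ordinary Python file calls in a throwaway temporary directory; the file name and contents below are invented for illustration.

```python
import os
import tempfile

d = tempfile.mkdtemp()                   # scratch directory to play in
path = os.path.join(d, "notes.txt")

with open(path, "w") as f:               # create + write
    f.write("hello file system")
with open(path, "r+") as f:
    data = f.read()                      # read the whole file
    f.seek(0)                            # reposition back to the start
    f.truncate(5)                        # truncate: keep only "hello"
with open(path) as f:
    truncated = f.read()
os.remove(path)                          # delete

print(data, "->", truncated)
```

Under the hood, each call becomes a system call that the file system services by consulting its directory structures, which is what the next section is about.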

A directory is like the card catalog for the file system. Each file system must have a directory to keep things straight. The directory contains the map of where each file is in the file system. It uses each file's unique name, identifier, and path to maintain order within the file system. There are many structures a directory can use to maintain the file system. The text described the following five generic structures.

Single-Level Directory
This structure is fairly self-explanatory. It is a single directory with all files in one directory. It is flat, with everything "right there." While simple, it has a single glaring problem. With all files in one directory, as the directory grows the file system gets increasingly more difficult to manage. Keeping track of all those file names is increasingly challenging. Because of this limitation a multiple-user system is virtually impossible.

Two-Level Directory
This structure is similar to the single level but adds a new, higher-level directory for each user. A master file directory is accessed when a user logs on, and their specific user-level directory is accessed from there. So, provided files are saved under a specific user, the files are kept separate in separate directories. Again, this sounds good and is good for most personal computers, but it lacks a good way to share files across users. If a file is saved in one user's directory, it cannot be accessed by another user. There is no sharing with a two-level directory.

Tree-Structure Directory
This structure is sort of a two-level within a two-level, and so on. There is a main directory that contains the user-specific directories. The user-specific directories contain subdirectories to keep the many files in order and maintain efficiency and performance. There aren't really drawbacks that I would consider significant, but there are a couple of note. Files in the structure can be shared, but specific permission must be granted between the user directories. And the path name must be specified to change to the correct directory and access the requested file. This is the most common of all directory structures.
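A tree-structured directory can be sketched as nested dictionaries: inner dicts are directories, strings are file contents, and resolving a path name means walking down the tree one component at a time. All names below are invented.

```python
# Inner dicts = directories, strings = file contents.
root = {
    "alice": {"docs": {"report.txt": "draft"}, "todo.txt": "buy milk"},
    "bob":   {"music": {}},
}

def resolve(tree, path):
    """Walk a /-separated path name down the directory tree."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]     # KeyError here means the path is bad
    return node

print(resolve(root, "/alice/docs/report.txt"))  # draft
```

This also shows why the path name must be specified: `report.txt` alone is ambiguous, but `/alice/docs/report.txt` names exactly one file.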

Acyclic Graph Directory
This file structure is a little confusing. It is a tree structure with the ability to share subdirectories and files. This is not sharing two copies of the same file; it is a living document between two users, where both are working in the same file or directory and see the changes in real time. The challenge lies in how we all work with the file if it only exists in one place. Of course, you could copy it to your directory, but then you are not sharing. So, enter the concept of a link. We all know what it is: simply a pointer to the location of the file within the shared directory. But how does the system handle a shared file when it receives a request to delete it? Are there any instances of the file still in use? Do you really want to delete the file? By using links, the system can monitor (a simplified term) how many links to the file remain, and thereby continue with the delete or deny it.

General Graph Directory
With the acyclic structure, the system guarantees there are no cycles in the directory, meaning a search can traverse it without ever circling back to an area it has already visited. In the general graph structure, cycles are allowed. This structure is basically what happens when you add links to a basic tree structure. With a tree structure you can share files and directories, but access is direct, not through links. When links are added you get a graph structure where multiple directories and files can be shared via links as well as by direct path. The drawback is the cycle. If we search an acyclic graph, the search never scans the same section twice. With a general graph, the same sections may be scanned more than once, resulting in performance issues and possible infinite loops as the traversal goes around and around. To compensate, a system with a cyclic graph will institute "garbage collection," marking everything it reaches as it scans and then cleaning up the unreachable leftovers, preventing a rescan of the same section. This is expensive and time consuming, so it is not very popular.

 

I/O devices vary greatly. They can be external devices such as a mouse, keyboard, printer, or monitor. They can also be internal devices like a network (PCI) card or a GPU. These are connected to memory via buses and communicate with the CPU through driver software and controllers. There are controllers at each end of the communication: the host controller on the internal side and the device controller on the external side. The I/O device controller sends its request to the host controller. The host controller moves the request into memory, where the CPU can translate it and respond through memory. Then the host controller returns the action to the device controller, where it is translated and the physical action takes place.



Security and protection are ever-growing aspects of modern computers. As hardware and software grow in complexity, so does the threat of malicious attack. "Protection refers to a mechanism for controlling the access of programs, processes, or users to the resources defined by a computer system. Security is a measure of confidence that the integrity of a system and its data will be preserved" (Silberschatz et al., 2014). There are two main principles of protection, the principle of least privilege and the need-to-know principle, that help to define the process of protection. The principle of least privilege is the idea that a program or user is given "just enough privileges to perform their tasks" (Silberschatz et al., 2014). This principle is funny because it sounds like we are giving them just enough rope to hang themselves. The need-to-know principle centers around the concept that having enough access is great, but we should give just enough knowledge to complete the requested task and nothing more (Silberschatz et al., 2014). From these two principles we derive language-based and domain-based protection.

Language-based protection is protection written into the code of a user program. It is less secure than kernel-based protection: it relies on the accuracy of the programmer and the accuracy of the compiler and translator, and it is more prone to malfunction than hardware-based protection. It is more flexible in its approach, though. Language-based protection can be easily altered to make necessary changes after it is programmed. Hardware-based protection is less capable of being adjusted if there are flaws in the protection model.

Domain-based protection is centered around a domain containing a set of rules for the processes attempting to operate within it. When a process requests an operation, it is assigned to the domain with the permissions required to carry out that operation. If a program needs access to a specific file to operate, it will need to be assigned to the domain with permission to that file. If the program does not have the key for that domain, it will not be granted access to the file. Domain-based protection is flexible in that if a particular access is needed, the process can switch to another domain that provides it. It does, however, suffer from the difficulty of maintaining and revoking permission to specific locations when multiple programs access the same location.

The access matrix is a grid, a file table of sorts, that indicates permissions based on domains. Each row is a domain, each column is an object (such as a file), and each cell holds the rights that domain has for that object. The text provides an example at its most simple form (Silberschatz et al., 2014). Note that each domain can also be listed as an object within the access matrix to allow for domain switching, a technique that grants different permissions to objects as they are needed.
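A tiny access matrix can be sketched as a dictionary keyed by (domain, object) pairs. The domains, files, and rights below are invented in the spirit of the textbook figure (which did not survive into this post), not copied from it.

```python
# (domain, object) -> set of rights held by that domain on that object.
matrix = {
    ("D1", "file1"): {"read"},
    ("D1", "file3"): {"read"},
    ("D2", "file2"): {"read", "write", "execute"},
    ("D3", "file1"): {"execute"},
}

def allowed(domain, obj, right):
    """Check whether a domain holds a given right on an object."""
    return right in matrix.get((domain, obj), set())

print(allowed("D2", "file2", "write"))   # True
print(allowed("D1", "file1", "write"))   # False — D1 can only read file1
```

Domain switching fits the same model: add a column per domain (e.g., `("D1", "D2"): {"switch"}`) and a process in D1 may move to D2 only if that cell grants it.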


Security is ensured by the protection method that we choose. There are four high-level classes of security that need to be addressed: physical environment, user/human, network, and operating system. Each of these poses its own vulnerabilities and will need some form of protection. Physical environment means the location of the computer needs to be protected from unauthorized access. The user, or human, needs correctly granted authorization to access the device and its contents. Network security deals with protecting connected devices from outside threats via hardware and protective software. Lastly, the operating system needs to be protected as described above, with either hardware/domain-based or language-based security. There are many ways a computer can be attacked, and each of them must be addressed to maintain security.






This has been a great study of the fundamentals of an operating system. It was very helpful to learn how the software talks to the hardware. Part of this course has taught me that I may want to consider a career path in OS development or engineering architecture.

 

References

Silberschatz, A., Galvin, P. B., & Gagne, G. (2014). Operating system concepts essentials (2nd ed.). Retrieved from https://redshelf.com/










