Section 0b

Introduction to File Management

Files and Folders and Directories

Oh My!

Many students are already familiar with files, file folders, directories, and so on. If you are, you might still want to peruse this chapter for a few tips that you have probably never been taught before.

However, other students have never really manipulated files directly, and that's okay! You come to college to learn stuff and if you don't know it, now is the time.

True Story: A few years back, there was a student who was asked to plug a network cable into a laboratory computer. She had no idea what that was. In that class and others, she learned her stuff (including what a network cable was), got her degree, and she is now a team lead at a major international software development company. Again, students come to College or University to learn. And by definition, that means it's okay not to know stuff right out of the box. Never let not knowing stuff be a problem.

First, The Basics

Everything that is stored on a computer is data. Data is stored as binary values (i.e., ones and zeros) in either volatile memory, which is used during normal operations and lost when the computer is powered down, or non-volatile memory, also called storage. Storage is found on hard drives, memory sticks, and other devices called, believe it or not, storage devices. Data in non-volatile memory is said to persist, or remain valid, even after the computer is shut down.

With that said, all data stored on a computer is stored in the form of a file. This is a metaphorical term borrowed from historical office operations where data with specific characteristics were stored in a manila file.

There can be files of all kinds on a computer. There are executable files, which are the programs that run, and there are data files, which store the data being manipulated, such as word processor files, spreadsheet files, web page files, and so on. In some systems, file names commonly end with an extension, a few appended characters that indicate what kind of data is held, such as .exe, .doc, .xls, or .html.
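As a small illustration of the extension idea, the sketch below pulls the extension off a few file names in a Linux/Unix/Mac shell. Apart from Project_1.exe, which appears later in this section, the file names are hypothetical and exist only for this example. The `${name##*.}` form is standard shell syntax that removes everything up to and including the last dot.

```shell
# Hypothetical file names, used only to illustrate extensions.
for name in Project_1.exe report.doc grades.xls index.html; do
    # ${name##*.} strips everything through the last dot,
    # leaving only the extension.
    ext="${name##*.}"
    echo "$name has extension .$ext"
done
```

Keep in mind that an extension is only a naming convention; renaming a file does not change the data stored inside it.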

Next, The Management

Obviously if there are files on a computer - and there are commonly tens of thousands of files on a given computer - those files must be managed in some way so that users can find and manipulate them.

One way to manage all the files would be to just put them on the storage device all together. However, finding one file for a science laboratory report among tens of thousands of other files would be a nightmare. It would just not work.

So, the wonderful folks who created operating systems decided to provide a format for organizing the files. They created organizing structures called directories or folders, again borrowing from historical office operations.

Operating system files - and there are a lot of these - are organized under one folder, and then under many, many subfolders (or subdirectories) below. Program files are organized under some kind of directory possibly called "Programs" or sometimes "bin". Bin is short for binary file, and this is the folder name used for executable files in some systems. Side note: As mentioned, all files are actually binary but the designers had to come up with something that was different from various data formats.

And Finally, So What?

Computing Scientists, and indeed anyone who wants to have reasonably efficient and timely access to their files, should take advantage of folders to manage their own files.

Most systems will initialize the computer's filing system with some kind of master document folder. In MS Windows, the directory is called, amazingly enough, "Documents". In Linux, there is typically a folder called "home" (on a Mac it is called "Users") with a sub folder that has the user's login name. Users should probably create a "Docs" or "Documents" folder under it if one does not already exist, since other files might reside under the user's primary folder as well. Under the documents folder, in whatever way it is named, a reasonable overall organization of a student's data files might start with a handful of top-level folders, one of which is a "Courses" folder.

Creating this kind of file organization requires creation of new folders. For students not familiar with how to create a new folder, there are myriad video and text tutorials for this online, and it is never difficult. Directories or folders can even be created from the command line if a student chooses to use that form of computer interaction.
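For students curious about the command line route, here is a minimal sketch for a Linux/Unix/Mac shell. It works inside a throwaway scratch folder so nothing in a real home folder is touched. The folder names Documents and Courses are only illustrative choices; Windows also has a mkdir command, but its options differ.

```shell
# Work in a throwaway scratch folder so nothing real is changed.
# mktemp -d creates a new, empty temporary directory and prints its path.
scratch="$(mktemp -d)"
cd "$scratch"

# mkdir ("make directory") creates a new folder.
mkdir Documents

# -p also creates any missing parent folders along the way,
# and does not complain if a folder already exists.
mkdir -p Documents/Courses

# List what now exists under Documents.
ls Documents
```

In everyday use you would skip the scratch folder, change into your own documents folder with cd, and run mkdir there.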

Now consider the following expansion of the Courses directory. Each line is a subdirectory or sub folder of the Courses folder.

Courses
|
|_ CS 105
|
|_ CS 136
|
|_ CS 136L
|
|_ BIO 181
|
|_ MATH 136

And then an individual course can be broken down further, as follows.

Courses
|
|_ CS 105
|
|_ CS 136
|   |
|   |_ Syllabus
|   |
|   |_ Class Notes
|   |
|   |_ Assignments-Projects
|       |
|       |_ PA01
|       |
|       |_ PA02
|       |
|       |_ PA03
|
|_ CS 136L
|
|_ BIO 181
|
|_ MATH 136

Again, this is more organizational work than might be required. However, as courses progress and things get more complicated - and things are going to get more complicated - the student who takes the time to think this through and organize her or his file system will be the one who spends less time searching for, and sometimes losing, important files.
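The whole hierarchy can also be built in a few commands. The sketch below is for a Linux/Unix/Mac shell and assumes that Syllabus, Class Notes, and Assignments-Projects sit under CS 136, and that PA01 through PA03 sit under Assignments-Projects. It builds everything inside a scratch folder rather than a real documents folder.

```shell
# Build the example hierarchy inside a throwaway scratch folder.
scratch="$(mktemp -d)"
cd "$scratch"

# Quotes are required because these folder names contain spaces.
mkdir -p "Courses/CS 105" "Courses/CS 136L" "Courses/BIO 181" "Courses/MATH 136"
mkdir -p "Courses/CS 136/Syllabus" "Courses/CS 136/Class Notes"

# One folder per programming assignment.
for pa in PA01 PA02 PA03; do
    mkdir -p "Courses/CS 136/Assignments-Projects/$pa"
done

# Print every folder that was created, one path per line.
find Courses -type d | sort
```

Folder names with spaces work, but they must be quoted every time they are typed at the command line, which is one reason some people prefer names such as CS_136.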

And Then?

This online reference is not designed to teach students to just code. It is designed to teach students to think, and to develop well-structured and organized programs. File systems are a great place to start. Creating a well-organized and well-structured folder/directory system is not only a first step toward developing high quality programs, it is also evidence of a thing called "hierarchical thinking". Another primary goal of this online reference is to teach students how to think at higher levels. Here is where to jump in.

When you download and/or create your first projects, you want to put all the associated files in the same directory. Here is an example of a directory as shown in what is called a GUI or Graphical User Interface (e.g., Windows, Linux, Mac, etc.).

The highlighted files are the ones you must start with. If you notice the "Project_1.exe" file, that is the one that will be generated when you compile the other files together. Here is an example of a CLI or command line interface directory. You will use dir (directory) in Windows or ls (list) in Linux/Unix/Mac to bring up the list of files.
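As a rough sketch of what such a listing involves on Linux/Unix/Mac, the commands below create empty stand-in files, named after the two source files used in this section, inside a scratch folder and then list them. The real project files would of course contain code, and a real project folder may hold other files as well.

```shell
# Throwaway scratch folder standing in for a project directory.
scratch="$(mktemp -d)"
cd "$scratch"

# touch creates empty files; these are stand-ins for the real sources.
touch Console_IO_Utility.c project_1.c

# pwd ("print working directory") shows which folder you are in.
pwd

# ls lists the files in the current folder.
ls

# ls -l adds details such as size and modification time.
ls -l
```

Running pwd before compiling is a quick way to confirm that you are in the folder you think you are in.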

When you have organized all the files in the one correct directory, you can conduct the compiling process which would look like the following in this particular situation. You will see these operations in the next few sections depending on which system you have but it never hurts to look at things twice.

gcc -Wall Console_IO_Utility.c project_1.c -o Project_1

And What, There's More?

The last thing (seriously) to mention here is that sometimes storage fails, and sometimes humans fail to use storage correctly. If you have data that you don't want to lose, such as your English term paper, or say, a program you have spent several days writing, you MUST back it up on or in a device that is physically separate from your current working device (e.g., your computer).
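A minimal sketch of a manual backup on Linux/Unix/Mac follows. The folder named external_drive is only a stand-in for wherever a separate device actually appears on your system, and the work folder and its one-line source file are hypothetical placeholders for real course work.

```shell
# Scratch area: "work" stands in for your computer,
# "external_drive" stands in for a physically separate device.
scratch="$(mktemp -d)"
cd "$scratch"
mkdir -p work/PA01 external_drive
echo "int main(void) { return 0; }" > work/PA01/project_1.c

# Copy the whole folder (-R means recursive) onto the "external drive".
cp -R work/PA01 external_drive/PA01_backup

# diff -r compares two folders; it prints nothing and succeeds
# only when their contents match.
if diff -r work/PA01 external_drive/PA01_backup; then
    echo "Backup matches the original."
fi
```

Checking the copy, rather than assuming it worked, is the point of the diff step.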

There are versioning systems such as git or Subversion (and lots more) that can be used. As a Computing Scientist, learning one or more of these is an important part of your education. BUT, students who use these with an online hosting service must make sure that any school work is stored under a private mode or setting. Posting code, or any other school work, in a public mode is almost always an assurance that students will run afoul of Academic Integrity violations when other, less scrupulous, students copy their code. This is another piece of the thinking part of your education. Don't let this happen.

There are other external storage devices that can be plugged into a computer, and "cloud" storage capabilities that can be used. It doesn't matter what external storage is used or in what way it might be used; good Engineers of any kind, including Software Engineers, will assume that their device is going to fail sooner or later, and they will act on that assumption immediately. Murphy (the reputed source of Murphy's Law) is still in charge of all projects and work at all times, and he is ruthless, and he can cost you your grade.