How to Read a File Into Python

This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Reading and Writing Files in Python

One of the most common tasks that you can do with Python is reading and writing files. Whether it's writing to a simple text file, reading a complicated server log, or even analyzing raw byte data, all of these situations require reading or writing a file.

In this tutorial, you'll learn:

  • What makes up a file and why that's important in Python
  • The basics of reading and writing files in Python
  • Some basic scenarios of reading and writing files

This tutorial is mainly for beginner to intermediate Pythonistas, but there are some tips in here that more advanced programmers may appreciate as well.

What Is a File?

Before we can get into how to work with files in Python, it's important to understand what exactly a file is and how modern operating systems handle some of their aspects.

At its core, a file is a contiguous set of bytes used to store data. This data is organized in a specific format and can be anything as simple as a text file or as complicated as a program executable. In the end, these bytes are translated into binary 1s and 0s for easier processing by the computer.
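To make the idea of bytes and their binary form concrete, here's a minimal sketch. The sample text is an assumption chosen purely for illustration; any ASCII string would work the same way.

```python
# Encode a short piece of text into the bytes that would be stored in a file
text = "Pug"
data = text.encode("ascii")

# Each byte is an integer from 0 to 255
byte_values = list(data)

# Show each byte as the eight binary digits the computer ultimately works with
bits = [format(byte, "08b") for byte in data]

print(byte_values)  # [80, 117, 103]
print(bits)         # ['01010000', '01110101', '01100111']
```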

Files on most modern file systems are composed of three main parts:

  1. Header: metadata about the contents of the file (file name, size, type, and so on)
  2. Data: contents of the file as written by the creator or editor
  3. End of file (EOF): special character that indicates the end of the file

[Figure: the file format with the header on top, data contents in the middle and the footer on the bottom.]

What this data represents depends on the format specification used, which is typically represented by an extension. For example, a file that has an extension of .gif most likely conforms to the Graphics Interchange Format specification. There are hundreds, if not thousands, of file extensions out there. For this tutorial, you'll only deal with .txt or .csv file extensions.

File Paths

When you access a file on an operating system, a file path is required. The file path is a string that represents the location of a file. It's broken up into three major parts:

  1. Folder Path: the file folder location on the file system where subsequent folders are separated by a forward slash / (Unix) or backslash \ (Windows)
  2. File Name: the actual name of the file
  3. Extension: the end of the file path pre-pended with a period (.) used to indicate the file type

Here's a quick example. Let's say you have a file located within a file structure like this:

    /
    │
    ├── path/
    │   │
    │   ├── to/
    │   │   └── cats.gif
    │   │
    │   └── dog_breeds.txt
    │
    └── animals.csv

Let's say you wanted to access the cats.gif file, and your current location was in the same folder as path. In order to access the file, you need to go through the path folder and then the to folder, finally arriving at the cats.gif file. The Folder Path is path/to/. The File Name is cats. The File Extension is .gif. So the full path is path/to/cats.gif.
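If you'd like to see those three parts pulled apart in code, here's a small sketch using pathlib from the standard library. It assumes a Unix-style path and uses PurePosixPath, which only manipulates the string and never touches the file system.

```python
from pathlib import PurePosixPath

# PurePosixPath handles Unix-style paths without touching the file system
path = PurePosixPath("path/to/cats.gif")

folder_path = str(path.parent)  # the folder portion of the path
file_name = path.stem           # the name without its extension
extension = path.suffix         # the extension, including the period

print(folder_path, file_name, extension)  # path/to cats .gif
```

Note that pathlib reports the folder as path/to, without the trailing slash.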

Now let's say that your current location or current working directory (cwd) is in the to folder of our example folder structure. Instead of referring to the cats.gif by the full path of path/to/cats.gif, the file can be simply referenced by the file name and extension cats.gif.

    /
    │
    ├── path/
    │   │
    │   ├── to/  ← Your current working directory (cwd) is here
    │   │   └── cats.gif  ← Accessing this file
    │   │
    │   └── dog_breeds.txt
    │
    └── animals.csv

But what about dog_breeds.txt? How would you access that without using the full path? You can use the special characters double-dot (..) to move one directory up. This means that ../dog_breeds.txt will reference the dog_breeds.txt file from the directory of to:

    /
    │
    ├── path/  ← Referencing this parent folder
    │   │
    │   ├── to/  ← Current working directory (cwd)
    │   │   └── cats.gif
    │   │
    │   └── dog_breeds.txt  ← Accessing this file
    │
    └── animals.csv

The double-dot (..) can be chained together to traverse multiple directories above the current directory. For example, to access animals.csv from the to folder, you would use ../../animals.csv.
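Here's a rough sketch of how those relative paths resolve. It assumes the to folder lives at /path/to, as in the example tree, and the resolve() helper is a hypothetical name used only for illustration. posixpath.normpath() works purely on the text of the path, so none of these files need to exist.

```python
import posixpath

# Assumed absolute location of the "to" folder from the example tree
cwd = "/path/to"


def resolve(relative_path: str) -> str:
    """Join a relative path onto cwd and collapse any '..' segments."""
    return posixpath.normpath(posixpath.join(cwd, relative_path))


print(resolve("cats.gif"))           # /path/to/cats.gif
print(resolve("../dog_breeds.txt"))  # /path/dog_breeds.txt
print(resolve("../../animals.csv"))  # /animals.csv
```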

Line Endings

One problem often encountered when working with file data is the representation of a new line or line ending. The line ending has its roots from back in the Morse Code era, when a specific pro-sign was used to communicate the end of a transmission or the end of a line.

Later, this was standardized for teleprinters by both the International Organization for Standardization (ISO) and the American Standards Association (ASA). The ASA standard states that line endings should use the sequence of the Carriage Return (CR or \r) and the Line Feed (LF or \n) characters (CR+LF or \r\n). The ISO standard, however, allowed for either the CR+LF characters or just the LF character.

Windows uses the CR+LF characters to indicate a new line, while Unix and the newer Mac versions use just the LF character. This can cause some complications when you're processing files on an operating system that is different from the file's source. Here's a quick example. Let's say that we examine the file dog_breeds.txt that was created on a Windows system:

    Pug\r\n
    Jack Russell Terrier\r\n
    English Springer Spaniel\r\n
    German Shepherd\r\n
    Staffordshire Bull Terrier\r\n
    Cavalier King Charles Spaniel\r\n
    Golden Retriever\r\n
    West Highland White Terrier\r\n
    Boxer\r\n
    Border Terrier\r\n

This same output will be interpreted on a Unix device differently:

    Pug\r
    \n
    Jack Russell Terrier\r
    \n
    English Springer Spaniel\r
    \n
    German Shepherd\r
    \n
    Staffordshire Bull Terrier\r
    \n
    Cavalier King Charles Spaniel\r
    \n
    Golden Retriever\r
    \n
    West Highland White Terrier\r
    \n
    Boxer\r
    \n
    Border Terrier\r
    \n

This can make iterating over each line problematic, and you may need to account for situations like this.
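One way to account for it, sketched below under the assumption that you already have the raw bytes in memory, is to let str.splitlines() do the splitting, since it treats \r\n, \n, and \r all as line boundaries. The sample data is made up for illustration.

```python
# Raw bytes as they might appear in a file saved on Windows
windows_data = b"Pug\r\nJack Russell Terrier\r\nBoxer\r\n"

text = windows_data.decode("ascii")

# Splitting on "\n" alone leaves a stray "\r" at the end of every line
naive_lines = text.split("\n")

# str.splitlines() recognizes \r\n, \n, and \r as line boundaries
clean_lines = text.splitlines()

print(naive_lines)  # ['Pug\r', 'Jack Russell Terrier\r', 'Boxer\r', '']
print(clean_lines)  # ['Pug', 'Jack Russell Terrier', 'Boxer']
```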

Character Encodings

Another common problem that you may face is the encoding of the byte data. An encoding is a translation from byte data to human readable characters. This is typically done by assigning a numerical value to represent a character. The two most common encodings are the ASCII and Unicode formats. ASCII can only store 128 characters, while Unicode can contain up to 1,114,112 characters.

ASCII is actually a subset of Unicode (UTF-8), meaning that ASCII and Unicode share the same numerical to character values. It's important to note that parsing a file with the incorrect character encoding can lead to failures or misrepresentation of the character. For example, if a file was created using the UTF-8 encoding, and you try to parse it using the ASCII encoding, then an error will be raised if there is a character outside of those 128 values.
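The following sketch shows both halves of that claim: plain ASCII text produces identical bytes under either encoding, while a non-ASCII character encoded as UTF-8 can't be decoded as ASCII. The sample word is an assumption picked only because it contains one character outside the ASCII range.

```python
text = "Café"

# "é" falls outside ASCII, so UTF-8 stores it as two bytes
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b'Caf\xc3\xa9'

# ASCII-only text produces identical bytes under both encodings
same_bytes = "Pug".encode("ascii") == "Pug".encode("utf-8")

try:
    utf8_bytes.decode("ascii")
except UnicodeDecodeError as error:
    decode_failed = True
    print(f"Could not decode as ASCII: {error}")
else:
    decode_failed = False
```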

Opening and Closing a File in Python

When you want to work with a file, the first thing to do is to open it. This is done by invoking the open() built-in function. open() has a single required argument that is the path to the file. open() has a single return, the file object:

    file = open('dog_breeds.txt')

After you open a file, the next thing to learn is how to close it.

It's important to remember that it's your responsibility to close the file. In most cases, upon termination of an application or script, a file will be closed eventually. However, there is no guarantee when exactly that will happen. This can lead to unwanted behavior including resource leaks. It's also a best practice within Python (Pythonic) to make sure that your code behaves in a way that is well defined and reduces any unwanted behavior.

When you're manipulating a file, there are two ways that you can use to ensure that a file is closed properly, even when encountering an error. The first way to close a file is to use the try-finally block:

    reader = open('dog_breeds.txt')
    try:
        # Further file processing goes here
    finally:
        reader.close()

If you're unfamiliar with what the try-finally block is, check out Python Exceptions: An Introduction.

The second way to close a file is to use the with statement:

    with open('dog_breeds.txt') as reader:
        # Further file processing goes here

The with statement automatically takes care of closing the file once it leaves the with block, even in cases of error. I highly recommend that you use the with statement as much as possible, as it allows for cleaner code and makes handling any unexpected errors easier for you.
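You can check that behavior for yourself. The sketch below is a self-contained illustration rather than part of the tutorial's running example: it writes a throwaway file to a temporary directory, raises an error on purpose inside the with block, and then inspects the file object's closed attribute.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dog_breeds.txt")
    with open(path, "w") as writer:
        writer.write("Pug\n")

    try:
        with open(path) as reader:
            first_line = reader.readline()
            # Simulate a failure in the middle of processing
            raise ValueError("something went wrong")
    except ValueError:
        pass

    # The with statement closed the file even though an exception was raised
    closed_after_error = reader.closed

print(closed_after_error)  # True
```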

Most likely, you'll also want to use the second positional argument, mode. This argument is a string that contains multiple characters to represent how you want to open the file. The default and most common is 'r', which represents opening the file in read-only mode as a text file:

    with open('dog_breeds.txt', 'r') as reader:
        # Further file processing goes here

Other options for modes are fully documented online, but the most commonly used ones are the following:

    Character       Meaning
    'r'             Open for reading (default)
    'w'             Open for writing, truncating (overwriting) the file first
    'rb' or 'wb'    Open in binary mode (read/write using byte data)
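Here's a short sketch of those modes in action. It uses a temporary directory and a made-up file name so that it doesn't touch any of your real files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    # 'w' creates the file, or truncates it if it already exists
    with open(path, "w") as writer:
        writer.write("first version\n")
    with open(path, "w") as writer:
        writer.write("second version\n")

    # 'r' returns str objects
    with open(path, "r") as reader:
        text_content = reader.read()

    # 'rb' returns bytes objects
    with open(path, "rb") as reader:
        byte_content = reader.read()

print(repr(text_content))  # 'second version\n'
print(type(byte_content))  # <class 'bytes'>
```

Because the second 'w' truncated the file, only the second version survives.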

Let's go back and talk a little about file objects. A file object is:

"an object exposing a file-oriented API (with methods such as read() or write()) to an underlying resource." (Source)

There are three different categories of file objects:

  • Text files
  • Buffered binary files
  • Raw binary files

Each of these file types is defined in the io module. Here's a quick rundown of how everything lines up.

Text File Types

A text file is the most common file that you'll encounter. Here are some examples of how these files are opened:

    open('abc.txt')

    open('abc.txt', 'r')

    open('abc.txt', 'w')

With these types of files, open() will return a TextIOWrapper file object:

    >>> file = open('dog_breeds.txt')
    >>> type(file)
    <class '_io.TextIOWrapper'>

This is the default file object returned by open().

Buffered Binary File Types

A buffered binary file type is used for reading and writing binary files. Here are some examples of how these files are opened:

    open('abc.txt', 'rb')

    open('abc.txt', 'wb')

With these types of files, open() will return either a BufferedReader or BufferedWriter file object:

    >>> file = open('dog_breeds.txt', 'rb')
    >>> type(file)
    <class '_io.BufferedReader'>
    >>> # Careful: opening with 'wb' truncates the existing file
    >>> file = open('dog_breeds.txt', 'wb')
    >>> type(file)
    <class '_io.BufferedWriter'>

Raw File Types

A raw file type is:

"generally used as a low-level building-block for binary and text streams." (Source)

It is therefore not typically used.

Here's an example of how these files are opened:

    open('abc.txt', 'rb', buffering=0)

With these types of files, open() will return a FileIO file object:

    >>> file = open('dog_breeds.txt', 'rb', buffering=0)
    >>> type(file)
    <class '_io.FileIO'>
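To see how the three categories relate, the sketch below peeks underneath a text file object. It relies on the buffer and raw attributes, which the io documentation describes as implementation details that may not exist everywhere, so treat this as an illustration of how CPython layers these objects rather than as an API to build on. The file is a throwaway created just for the example.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "abc.txt")
    with open(path, "w") as writer:
        writer.write("Pug\n")

    with open(path) as text_file:
        # A text file wraps a buffered binary file, which wraps a raw file
        layers = (
            type(text_file),
            type(text_file.buffer),
            type(text_file.buffer.raw),
        )

print([layer.__name__ for layer in layers])
# ['TextIOWrapper', 'BufferedReader', 'FileIO']
```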

Reading and Writing Opened Files

Once you've opened up a file, you'll want to read or write to the file. First off, let's cover reading a file. There are multiple methods that can be called on a file object to help you out:

    Method               What It Does
    .read(size=-1)       This reads at most size characters (or bytes, in binary mode) from the file. If no argument is passed or None or -1 is passed, then the entire file is read.
    .readline(size=-1)   This reads at most size number of characters from the line. This continues to the end of the line and then wraps back around. If no argument is passed or None or -1 is passed, then the entire line (or rest of the line) is read.
    .readlines()         This reads the remaining lines from the file object and returns them as a list.

Using the same dog_breeds.txt file you used above, let's go through some examples of how to use these methods. Here's an example of how to open and read the entire file using .read():

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     # Read & print the entire file
    ...     print(reader.read())
    ...
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier

Here's an example of how to read up to five characters of a line each time using the Python .readline() method:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     # Read & print the first 5 characters of the line 5 times
    ...     print(reader.readline(5))
    ...     # Notice that line is greater than the 5 chars and continues
    ...     # down the line, reading 5 chars each time until the end of the
    ...     # line and then "wraps" around
    ...     print(reader.readline(5))
    ...     print(reader.readline(5))
    ...     print(reader.readline(5))
    ...     print(reader.readline(5))
    ...
    Pug

    Jack
    Russe
    ll Te
    rrier

Here's an example of how to read the entire file as a list using the Python .readlines() method:

    >>> f = open('dog_breeds.txt')
    >>> f.readlines()  # Returns a list object
    ['Pug\n', 'Jack Russell Terrier\n', 'English Springer Spaniel\n', 'German Shepherd\n', 'Staffordshire Bull Terrier\n', 'Cavalier King Charles Spaniel\n', 'Golden Retriever\n', 'West Highland White Terrier\n', 'Boxer\n', 'Border Terrier\n']

The above example can also be done by using list() to create a list out of the file object:

    >>> f = open('dog_breeds.txt')
    >>> list(f)
    ['Pug\n', 'Jack Russell Terrier\n', 'English Springer Spaniel\n', 'German Shepherd\n', 'Staffordshire Bull Terrier\n', 'Cavalier King Charles Spaniel\n', 'Golden Retriever\n', 'West Highland White Terrier\n', 'Boxer\n', 'Border Terrier\n']
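Notice that every item in those lists still ends with \n. If you'd rather have the bare breed names, one common approach is to strip the line ending as you build the list. In this sketch, io.StringIO stands in for an opened text file so that the example is self-contained, and the three sample breeds are assumptions for illustration.

```python
import io

# io.StringIO stands in for an opened text file so the sketch is self-contained
fake_file = io.StringIO("Pug\nJack Russell Terrier\nBoxer\n")

# rstrip("\n") removes only the trailing newline, leaving other whitespace alone
breeds = [line.rstrip("\n") for line in fake_file]

print(breeds)  # ['Pug', 'Jack Russell Terrier', 'Boxer']
```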

Iterating Over Each Line in the File

A common thing to do while reading a file is to iterate over each line. Here's an example of how to use the Python .readline() method to perform that iteration:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     # Read and print the entire file line by line
    ...     line = reader.readline()
    ...     while line != '':  # readline() returns an empty string at EOF
    ...         print(line, end='')
    ...         line = reader.readline()
    ...
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier

Another way you could iterate over each line in the file is to use the Python .readlines() method of the file object. Remember, .readlines() returns a list where each element in the list represents a line in the file:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     for line in reader.readlines():
    ...         print(line, end='')
    ...
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier

However, the above examples can be further simplified by iterating over the file object itself:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     # Read and print the entire file line by line
    ...     for line in reader:
    ...         print(line, end='')
    ...
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier

This last approach is more Pythonic and can be quicker and more memory efficient. Therefore, it is suggested you use this instead.
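The memory benefit comes from the file object handing you one line at a time instead of building a full list first. Here's a small sketch of processing a file that way. Again, io.StringIO is a stand-in for a real (and possibly very large) opened file, and the sample contents are made up.

```python
import io

# io.StringIO stands in for a large opened text file
fake_file = io.StringIO("Pug\nJack Russell Terrier\nBoxer\nBorder Terrier\n")

terrier_count = 0
for line_number, line in enumerate(fake_file, start=1):
    # Only the current line needs to be held in memory at any one time
    if "Terrier" in line:
        terrier_count += 1

print(f"{terrier_count} of {line_number} lines mention a terrier")
# 2 of 4 lines mention a terrier
```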

Now let's dive into writing files. As with reading files, file objects have multiple methods that are useful for writing to a file:

    Method             What It Does
    .write(string)     This writes the string to the file.
    .writelines(seq)   This writes the sequence to the file. No line endings are appended to each sequence item. It's up to you to add the appropriate line ending(s).

Here's a quick example of using .write() and .writelines():

    with open('dog_breeds.txt', 'r') as reader:
        # Note: readlines doesn't trim the line endings
        dog_breeds = reader.readlines()

    with open('dog_breeds_reversed.txt', 'w') as writer:
        # Alternatively you could use
        # writer.writelines(reversed(dog_breeds))

        # Write the dog breeds to the file in reversed order
        for breed in reversed(dog_breeds):
            writer.write(breed)
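That example works because the lines returned by .readlines() already end with \n. When your data doesn't carry its own line endings, you need to add them yourself. The sketch below shows one way to do that with .writelines(); the list of breeds and the temporary file are assumptions for illustration.

```python
import os
import tempfile

breeds = ["Pug", "Boxer", "Border Terrier"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "breeds.txt")

    with open(path, "w") as writer:
        # writelines() adds nothing between items, so append "\n" yourself
        writer.writelines(f"{breed}\n" for breed in breeds)

    with open(path) as reader:
        written = reader.read()

print(repr(written))  # 'Pug\nBoxer\nBorder Terrier\n'
```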

Working With Bytes

Sometimes, you may need to work with files using byte strings. This is done by adding the 'b' character to the mode argument. All of the same methods for the file object apply. However, each of the methods expects and returns a bytes object instead:

    >>> with open('dog_breeds.txt', 'rb') as reader:
    ...     print(reader.readline())
    ...
    b'Pug\n'

Opening a text file using the b flag isn't that interesting. Let's say we have this cute picture of a Jack Russell Terrier (jack_russell.png):

[Image: a cute picture of a Jack Russell Terrier. CC BY 3.0 (https://creativecommons.org/licenses/by/3.0), from Wikimedia Commons]

You can actually open that file in Python and examine the contents! Since the .png file format is well defined, the header of the file is 8 bytes broken up like this:

    Value             Interpretation
    0x89              A "magic" number to indicate that this is the start of a PNG
    0x50 0x4E 0x47    PNG in ASCII
    0x0D 0x0A         A DOS style line ending \r\n
    0x1A              A DOS style EOF character
    0x0A              A Unix style line ending \n

Sure enough, when you open the file and read these bytes individually, you can see that this is indeed a .png file header:

    >>> with open('jack_russell.png', 'rb') as byte_reader:
    ...     print(byte_reader.read(1))
    ...     print(byte_reader.read(3))
    ...     print(byte_reader.read(2))
    ...     print(byte_reader.read(1))
    ...     print(byte_reader.read(1))
    ...
    b'\x89'
    b'PNG'
    b'\r\n'
    b'\x1a'
    b'\n'
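You could wrap that check into a small helper. The sketch below is one possible way to do it: the function name looks_like_png() and the sample byte strings are hypothetical, and the signature constant is simply the eight header bytes listed in the table.

```python
# The eight-byte signature from the header table
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(header: bytes) -> bool:
    """Return True if the given bytes begin with the PNG signature."""
    return header.startswith(PNG_SIGNATURE)


# Hypothetical sample data standing in for the start of real files
print(looks_like_png(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))  # True
print(looks_like_png(b"GIF89a"))                               # False
```

With a real file, you would open it in 'rb' mode and pass the result of .read(8) to the helper.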

A Total Example: dos2unix.py

Let's bring this whole thing domicile and expect at a full instance of how to read and write to a file. The following is a dos2unix like tool that will catechumen a file that contains line endings of \r\n to \northward.

This tool is cleaved up into three major sections. The kickoff is str2unix(), which converts a string from \r\northward line endings to \n. The second is dos2unix(), which converts a string that contains \r\northward characters into \n. dos2unix() calls str2unix() internally. Finally, there'south the __main__ block, which is called only when the file is executed as a script. Think of information technology equally the master function institute in other programming languages.

                                                  """                  A simple script and library to convert files or strings from dos like                  line endings with Unix like line endings.                  """                  import                  argparse                  import                  bone                  def                  str2unix                  (                  input_str                  :                  str                  )                  ->                  str                  :                  r                  """                                      Converts the cord from \r\n line endings to \n                                      Parameters                                      ----------                                      input_str                                      The string whose line endings volition be converted                                      Returns                                      -------                                      The converted string                                      """                  r_str                  =                  input_str                  .                  
replace                  (                  '                  \r\n                  '                  ,                  '                  \n                  '                  )                  return                  r_str                  def                  dos2unix                  (                  source_file                  :                  str                  ,                  dest_file                  :                  str                  ):                  """                                      Converts a file that contains Dos like line endings into Unix like                                      Parameters                                      ----------                                      source_file                                      The path to the source file to be converted                                      dest_file                                      The path to the converted file for output                                      """                  # Notation: Could add file being checking and file overwriting                  # protection                  with                  open                  (                  source_file                  ,                  'r'                  )                  as                  reader                  :                  dos_content                  =                  reader                  .                  read                  ()                  unix_content                  =                  str2unix                  (                  dos_content                  )                  with                  open                  (                  dest_file                  ,                  'w'                  )                  every bit                  writer                  :                  writer                  .                  
write                  (                  unix_content                  )                  if                  __name__                  ==                  "__main__"                  :                  # Create our Statement parser and set up its description                  parser                  =                  argparse                  .                  ArgumentParser                  (                  description                  =                  "Script that converts a DOS like file to an Unix like file"                  ,                  )                  # Add together the arguments:                  #   - source_file: the source file we want to convert                  #   - dest_file: the destination where the output should go                  # Annotation: the use of the argument type of argparse.FileType could                  # streamline some things                  parser                  .                  add_argument                  (                  'source_file'                  ,                  help                  =                  'The location of the source '                  )                  parser                  .                  add_argument                  (                  '--dest_file'                  ,                  help                  =                  'Location of dest file (default: source_file appended with `_unix`'                  ,                  default                  =                  None                  )                  # Parse the args (argparse automatically grabs the values from                  # sys.argv)                  args                  =                  parser                  .                  parse_args                  ()                  s_file                  =                  args                  .                  source_file                  d_file                  =                  args                  .                  
dest_file                  # If the destination file wasn't passed, then assume we want to                  # create a new file based on the old one                  if                  d_file                  is                  None                  :                  file_path                  ,                  file_extension                  =                  os                  .                  path                  .                  splitext                  (                  s_file                  )                  d_file                  =                  f                  '                  {                  file_path                  }                  _unix                  {                  file_extension                  }                  '                  dos2unix                  (                  s_file                  ,                  d_file                  )                              
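One caveat about the script above: it opens both files in the default text mode, where Python already translates \r\n to \n while reading and, on Windows, translates \n back to \r\n while writing. The following is a minimal sketch, assuming you want the conversion to behave the same on every platform. It passes newline='' to open() to switch that translation off. The function name dos2unix_exact and the throwaway file names are made up for this illustration.

```python
import os
import tempfile


def dos2unix_exact(source_file: str, dest_file: str) -> None:
    """Hypothetical variant of dos2unix() that disables newline translation."""
    # newline='' returns line endings exactly as they are stored in the file
    with open(source_file, 'r', newline='') as reader:
        dos_content = reader.read()

    unix_content = dos_content.replace('\r\n', '\n')

    # newline='' writes '\n' as-is instead of converting it to os.linesep
    with open(dest_file, 'w', newline='') as writer:
        writer.write(unix_content)


# Demo on a throwaway file written in binary mode so the bytes are exact
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'dos.txt')
dst = os.path.join(tmp_dir, 'unix.txt')

with open(src, 'wb') as f:
    f.write(b'Pug\r\nBoxer\r\n')

dos2unix_exact(src, dst)

with open(dst, 'rb') as f:
    converted = f.read()

print(converted)
```

Reading the result back in binary mode is what makes the check trustworthy, since text mode would hide the difference between the two line endings.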

Tips and Tricks

Now that you've mastered the basics of reading and writing files, here are some tips and tricks to help you grow your skills.

__file__

The __file__ attribute is a special attribute of modules, similar to __name__. It is:

"the pathname of the file from which the module was loaded, if it was loaded from a file." (Source)

Here's a real world example. In one of my past jobs, I did multiple tests for a hardware device. Each test was written using a Python script with the test script file name used as a title. These scripts would then be executed and could print their status using the __file__ special attribute. Here's an example folder structure:

    project/
    |
    ├── tests/
    |   ├── test_commanding.py
    |   ├── test_power.py
    |   ├── test_wireHousing.py
    |   └── test_leds.py
    |
    └── main.py

Running main.py produces the following:

    $ python main.py
    tests/test_commanding.py Started:
    tests/test_commanding.py Passed!
    tests/test_power.py Started:
    tests/test_power.py Passed!
    tests/test_wireHousing.py Started:
    tests/test_wireHousing.py Failed!
    tests/test_leds.py Started:
    tests/test_leds.py Passed!

I was able to run and get the status of all my tests dynamically through use of the __file__ special attribute.
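The test scripts themselves aren't shown, so the following is only a sketch of how one script could label its own output with __file__. The helper status_line() is hypothetical, and the fallback value is there because __file__ is not defined in an interactive session.

```python
def status_line(script_path: str, passed: bool) -> str:
    """Build a status message labeled with the script's path."""
    outcome = 'Passed!' if passed else 'Failed!'
    return f'{script_path} {outcome}'


if __name__ == '__main__':
    # __file__ holds the path this script was loaded from
    this_script = globals().get('__file__', '<interactive>')
    print(f'{this_script} Started:')
    print(status_line(this_script, passed=True))
```

Because every script reports under its own path, a runner only needs to execute the files and collect what they print.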

Appending to a File

Sometimes, you may want to append to a file or start writing at the end of an already populated file. This is easily done by using the 'a' character for the mode argument:

    with open('dog_breeds.txt', 'a') as a_writer:
        a_writer.write('\nBeagle')

When you examine dog_breeds.txt again, you'll see that the beginning of the file is unchanged and Beagle is now added to the end of the file:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     print(reader.read())
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier
    Beagle
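To see why the mode matters, here is a small sketch that contrasts 'a' with 'w' on a throwaway file. The file name dog_breeds_demo.txt and its contents are made up so that the real dog_breeds.txt stays untouched.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'dog_breeds_demo.txt')

# 'w' creates the file, or truncates it if it already exists
with open(path, 'w') as writer:
    writer.write('Pug\nBoxer')

# 'a' keeps the existing content and writes at the end
with open(path, 'a') as a_writer:
    a_writer.write('\nBeagle')

with open(path, 'r') as reader:
    after_append = reader.read()

# Opening with 'w' again discards everything written so far
with open(path, 'w') as writer:
    writer.write('Corgi')

with open(path, 'r') as reader:
    after_write = reader.read()

print(after_append.splitlines())
print(after_write.splitlines())
```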

Working With Two Files at the Same Time

There are times when you may want to read a file and write to another file at the same time. If you use the example that was shown when you were learning how to write to a file, it can actually be combined into the following:

    d_path = 'dog_breeds.txt'
    d_r_path = 'dog_breeds_reversed.txt'
    with open(d_path, 'r') as reader, open(d_r_path, 'w') as writer:
        dog_breeds = reader.readlines()
        writer.writelines(reversed(dog_breeds))

Creating Your Own Context Manager

There may come a time when you'll need finer control of the file object by placing it inside a custom class. When you do this, the with statement can no longer be used unless you add a few magic methods: __enter__ and __exit__. By adding these, you'll have created what's called a context manager.

__enter__() is invoked when entering the with statement. __exit__() is called upon exiting from the with statement block.

Here's a template that you can use to make your custom class:

    class my_file_reader():
        def __init__(self, file_path):
            self.__path = file_path
            self.__file_object = None

        def __enter__(self):
            self.__file_object = open(self.__path)
            return self

        def __exit__(self, type, val, tb):
            self.__file_object.close()

        # Additional methods implemented below

Now that you've got your custom class that is now a context manager, you can use it similarly to the open() built-in:

    with my_file_reader('dog_breeds.txt') as reader:
        # Perform custom class operations
        pass
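As a rough illustration of what the template buys you, the sketch below fills it in with one extra method and shows that __exit__() runs both on a normal exit and when the block raises. The class LineCounter, its closed flag, and the file breeds.txt are all hypothetical names chosen for this example.

```python
import os
import tempfile


class LineCounter:
    """Hypothetical context manager that wraps a text file."""

    def __init__(self, file_path):
        self._path = file_path
        self._file_object = None
        self.closed = True

    def __enter__(self):
        self._file_object = open(self._path)
        self.closed = False
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Runs on a normal exit and also when the block raises
        self._file_object.close()
        self.closed = True
        # Returning False lets any exception propagate to the caller
        return False

    def count_lines(self):
        return sum(1 for _ in self._file_object)


path = os.path.join(tempfile.mkdtemp(), 'breeds.txt')
with open(path, 'w') as writer:
    writer.write('Pug\nBoxer\nBeagle\n')

with LineCounter(path) as counter:
    line_count = counter.count_lines()

try:
    with LineCounter(path) as failing:
        raise ValueError('something went wrong inside the block')
except ValueError:
    pass

print(line_count, counter.closed, failing.closed)
```

The second with block is the interesting one: the file is still closed even though the body never finished.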

Here's a good example. Remember the cute Jack Russell image we had? Perhaps you want to open other .png files but don't want to parse the header file each time. Here's an example of how to do this. This example also uses custom iterators. If you're not familiar with them, check out Python Iterators:

    class PngReader():
        # Every .png file contains this in the header.  Use it to verify
        # the file is indeed a .png.
        _expected_magic = b'\x89PNG\r\n\x1a\n'

        def __init__(self, file_path):
            # Ensure the file has the correct extension
            if not file_path.endswith('.png'):
                raise NameError("File must be a '.png' extension")
            self.__path = file_path
            self.__file_object = None

        def __enter__(self):
            self.__file_object = open(self.__path, 'rb')

            magic = self.__file_object.read(8)
            if magic != self._expected_magic:
                raise TypeError("The File is not a properly formatted .png file!")

            return self

        def __exit__(self, type, val, tb):
            self.__file_object.close()

        def __iter__(self):
            # This and __next__() are used to create a custom iterator
            # See https://dbader.org/blog/python-iterators
            return self

        def __next__(self):
            # Read the file in "Chunks"
            # See https://en.wikipedia.org/wiki/Portable_Network_Graphics#%22Chunks%22_within_the_file

            initial_data = self.__file_object.read(4)

            # The file hasn't been opened or reached EOF.  This means we
            # can't go any further so stop the iteration by raising the
            # StopIteration.
            if self.__file_object is None or initial_data == b'':
                raise StopIteration
            else:
                # Each chunk has a len, type, data (based on len) and crc
                # Grab these values and return them as a tuple
                chunk_len = int.from_bytes(initial_data, byteorder='big')
                chunk_type = self.__file_object.read(4)
                chunk_data = self.__file_object.read(chunk_len)
                chunk_crc = self.__file_object.read(4)
                return chunk_len, chunk_type, chunk_data, chunk_crc

You can now open .png files and properly parse them using your custom context manager:

    >>> with PngReader('jack_russell.png') as reader:
    ...     for l, t, d, c in reader:
    ...         print(f"{l:05}, {t}, {c}")
    00013, b'IHDR', b'v\x121k'
    00001, b'sRGB', b'\xae\xce\x1c\xe9'
    00009, b'pHYs', b'(<]\x19'
    00345, b'iTXt', b"L\xc2'Y"
    16384, b'IDAT', b'i\x99\x0c('
    16384, b'IDAT', b'\xb3\xfa\x9a$'
    16384, b'IDAT', b'\xff\xbf\xd1\n'
    16384, b'IDAT', b'\xc3\x9c\xb1}'
    16384, b'IDAT', b'\xe3\x02\xba\x91'
    16384, b'IDAT', b'\xa0\xa99='
    16384, b'IDAT', b'\xf4\x8b.\x92'
    16384, b'IDAT', b'\x17i\xfc\xde'
    16384, b'IDAT', b'\x8fb\x0e\xe4'
    16384, b'IDAT', b')3={'
    01040, b'IDAT', b'\xd6\xb8\xc1\x9f'
    00000, b'IEND', b'\xaeB`\x82'
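The four values in each tuple are enough to check a chunk for corruption, because a PNG chunk's CRC is a CRC-32 over its type and data fields. The sketch below is one possible way to do that with zlib.crc32() from the standard library. The helper crc_matches() is a hypothetical addition and not part of PngReader.

```python
import zlib


def crc_matches(chunk_type: bytes, chunk_data: bytes, chunk_crc: bytes) -> bool:
    """Check a PNG chunk's stored CRC against a freshly computed one."""
    # The CRC covers the type and data fields, but not the length field
    computed = zlib.crc32(chunk_type + chunk_data)
    return computed.to_bytes(4, byteorder='big') == chunk_crc


# The IEND chunk carries no data, so its CRC is the same in every PNG file
iend_ok = crc_matches(b'IEND', b'', b'\xaeB`\x82')
corrupted = crc_matches(b'IEND', b'', b'\x00\x00\x00\x00')
print(iend_ok, corrupted)
```

Inside the loop over a PngReader you could call crc_matches(t, d, c) for each chunk and stop at the first mismatch.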

Don't Re-Invent the Snake

There are common situations that you may encounter while working with files. Most of these cases can be handled using other modules. Two common file types you may need to work with are .csv and .json. Real Python has already put together some great articles on how to handle these:

  • Reading and Writing CSV Files in Python
  • Working With JSON Data in Python

Additionally, there are built-in libraries out there that you can use to help you:

  • wave : read and write WAV files (audio)
  • aifc : read and write AIFF and AIFC files (audio; removed from the standard library in Python 3.13)
  • sunau : read and write Sun AU files (removed from the standard library in Python 3.13)
  • tarfile : read and write tar archive files
  • zipfile : work with ZIP archives
  • configparser : easily create and parse configuration files
  • xml.etree.ElementTree : create or read XML based files
  • msilib : read and write Microsoft Installer files (removed from the standard library in Python 3.13)
  • plistlib : generate and parse Mac OS X .plist files
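As a quick taste of two of these modules, the sketch below writes a small configuration with configparser, stores it in an in-memory ZIP archive with zipfile, and reads it back. The section name, keys, and the file name settings.ini are invented for the example.

```python
import configparser
import io
import zipfile

# Build a small INI document with configparser
config = configparser.ConfigParser()
config['kennel'] = {'name': 'Happy Paws', 'capacity': '12'}

ini_buffer = io.StringIO()
config.write(ini_buffer)
ini_text = ini_buffer.getvalue()

# Store it in a ZIP archive that lives in memory instead of on disk
archive_bytes = io.BytesIO()
with zipfile.ZipFile(archive_bytes, 'w') as archive:
    archive.writestr('settings.ini', ini_text)

# Read the archive back and parse the configuration again
with zipfile.ZipFile(archive_bytes, 'r') as archive:
    names = archive.namelist()
    restored_text = archive.read('settings.ini').decode('utf-8')

restored = configparser.ConfigParser()
restored.read_string(restored_text)
capacity = restored.getint('kennel', 'capacity')

print(names, capacity)
```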

There are plenty more out there. Additionally, there are even more third party tools available on PyPI. Some popular ones are the following:

  • PyPDF2 : PDF toolkit
  • xlwings : read and write Excel files
  • Pillow : image reading and manipulation

You're a File Wizard Harry!

You did it! You now know how to work with files with Python, including some advanced techniques. Working with files in Python should now be easier than ever and is a rewarding feeling when you start doing it.

In this tutorial you've learned:

  • What a file is
  • How to open and close files properly
  • How to read and write files
  • Some advanced techniques when working with files
  • Some libraries to work with common file types

If you have any questions, hit us up in the comments.


Source: https://realpython.com/read-write-files-python/
