Input/Output: Files

Dr Andy Evans

  • Files
  • File types

     

  • Dealing with files starts with encapsulating the idea of a file in an object

File locations

Captured in two classes
  • java.io.File : Encapsulates a file on a drive.
  • java.net.URL : Encapsulates a Uniform Resource Locator (URL), which could include internet addresses.

java.io.File

  • Before we can read or write files we need to capture them. The File class represents an external file.
    	
    // Constructor signature: File(String pathname)
    File f = new File("e:/myFile.txt");
    
    
  • However, we must remember that different OSs have different file systems.
  • Note the use of a forward slash.
  • Java copes with most of this, but a drive letter such as "e:" wouldn't work on *NIX, a Mac, mobiles etc.
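
One way to avoid hardwiring a drive letter is to build the path from a system property. The following is a minimal sketch, assuming the file lives in the user's home directory; the class name PortablePathDemo and the method inHomeDirectory are made up for illustration.

```java
import java.io.File;

class PortablePathDemo {
    // Builds a File inside the user's home directory (an assumed location),
    // so no drive letter or OS-specific separator is written into the code.
    static File inHomeDirectory(String filename) {
        String home = System.getProperty("user.home");
        // The two-argument constructor joins parent and child with the
        // separator that is correct for the current operating system.
        return new File(home, filename);
    }

    public static void main(String[] args) {
        File f = inHomeDirectory("myFile.txt");
        System.out.println(f.getPath());
        System.out.println("Separator on this system: " + File.separator);
    }
}
```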

Getting file locations

  • java.awt.FileDialog : Opens an "Open file" box with a directory tree in it. This stays open until the user chooses a file or cancels.
  • Once a file is chosen, use FileDialog's getDirectory() and getFile() methods to get the directory and filename.

 

Getting file locations

	
import java.awt.*;
import java.io.*;

	
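// Show an "Open file" dialog; setVisible(true) blocks until the user chooses or cancels.
FileDialog fd = new FileDialog(new Frame());
fd.setVisible(true);
File f = null;

// Both values are null if the user cancels, so require both before building the File.
if ((fd.getDirectory() != null) && (fd.getFile() != null)) {
    f = new File(fd.getDirectory(), fd.getFile());
}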

Application Directory

  • Each object has a java.lang.Class object associated with it. This represents the class loaded into the JVM.
  • One use is to get resources local to the class, i.e. in the same directory as the .class file. We use a java.net.URL object to do this.
    	
    Class<?> thisClass = getClass();
    URL url = thisClass.getResource("myFile.txt");  // null if the resource isn't found
    
    
  • We can then use URL's getPath() to return the file path as a String for the File constructor (note that getPath() leaves characters such as spaces URL-encoded, e.g. as %20).
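
Turning a resource URL into a File can be sketched as below. This is an illustration, not the only approach: it goes via URL's toURI() rather than getPath(), so that encoded characters such as %20 are decoded. The class ResourceToFileDemo and its helper methods are hypothetical, and the URL is built from a File (instead of getResource()) purely so the sketch runs without a real resource on disk.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

class ResourceToFileDemo {
    // Converts a file: URL (such as one returned by getResource) to a File.
    static File toFile(URL url) {
        try {
            return new File(url.toURI()); // decodes %20 etc. back into real characters
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid file URL: " + url, e);
        }
    }

    // Stand-in for getResource(): makes a file: URL for the given File.
    static URL urlFor(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL url = urlFor(new File("my data.txt"));
        System.out.println(url.getPath());          // the space appears as %20
        System.out.println(toFile(url).getName());  // my data.txt
    }
}
```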

Useful methods

  • exists(), canRead() and canWrite(): Test whether the file exists and can be read or written to.
  • createNewFile() and createTempFile(): Create a new, empty file, and create a new file in the system's temporary directory (e.g. "temp" or "tmp").
  • delete() and deleteOnExit(): Delete the file (if permissions are correct). Delete when the JVM shuts down.
  • isDirectory() and listFiles(): Checks whether the File is a directory, and returns an array of Files representing the files in the directory. Can use a FilenameFilter object to limit the returned Files.
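
The methods above can be tried out together in a short sketch. It assumes the system's temporary directory is writable and listable; the class FileMethodsDemo and the method tryUsefulMethods are invented for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class FileMethodsDemo {
    // Creates a temp file, checks it, lists its directory through a
    // FilenameFilter, then deletes it. Returns true if every step succeeded.
    static boolean tryUsefulMethods() {
        try {
            // Static method; the prefix must be at least three characters long.
            File temp = File.createTempFile("demo", ".txt");
            boolean ok = temp.exists() && temp.canRead() && temp.canWrite();

            File dir = temp.getParentFile();
            ok = ok && dir.isDirectory();

            // A two-argument lambda acts as a FilenameFilter: keep only our file.
            File[] matches = dir.listFiles((d, name) -> name.equals(temp.getName()));
            ok = ok && matches != null && matches.length == 1;

            // delete() returns false if the file could not be removed.
            ok = ok && temp.delete() && !temp.exists();
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("All steps succeeded: " + tryUsefulMethods());
    }
}
```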

  • Files
  • File types

     

  • As we'll see, the type of the file has a big effect on how we handle it.

Binary vs. Text files

  • All files are really just binary 0 and 1 'bits'.
  • In 'binary' files, data is stored in binary representations of the primitive types:

    8 bits (e.g. 00000000) = 1 byte

    	
    00000000 00000000 00000000 00000000 	= int 0
    00000000 00000000 00000000 00000001 	= int 1
    00000000 00000000 00000000 00000010		= int 2
    00000000 00000000 00000000 00000100		= int 4
    00000000 00000000 00000000 00110001 	= int 49
    00000000 00000000 00000000 01000001 	= int 65
    00000000 00000000 00000000 11111111 	= int 255
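
The bit patterns above can be reproduced in code. This is a small sketch, assuming big-endian byte order (the default for ByteBuffer); the class IntBitsDemo and the method bits are hypothetical names.

```java
import java.nio.ByteBuffer;

class IntBitsDemo {
    // Returns the four bytes of an int as space-separated groups of 8 bits.
    static String bits(int value) {
        byte[] bytes = ByteBuffer.allocate(4).putInt(value).array();
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255, then left-pad the binary string to 8 digits.
            String s = Integer.toBinaryString(b & 0xFF);
            sb.append(String.format("%8s", s).replace(' ', '0'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(bits(65));   // 00000000 00000000 00000000 01000001
        System.out.println(bits(255));  // 00000000 00000000 00000000 11111111
    }
}
```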
    
    

Binary vs. Text files

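  • In text files, which can be read in Notepad++ etc., characters are stored by code number. Here each character is shown in a smaller 2-byte area, matching Java's 16-bit char (the bytes actually written to disk depend on the character encoding used):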
    	
    00000000 01000001 =  code 65  = char  "A"
    00000000 01100001 =  code 97  = char  "a"
    
    

Characters

  • All Java chars are 16-bit values from the international character set called Unicode.
  • Unicode extends the American Standard Code for Information Interchange (ASCII), whose characters are represented by the ints 0 to 127, and its superset, the 8-bit ISO-Latin 1 character set (0 to 255).
  • There are some invisible characters used for things like the end of lines.
    	
    char back = 8;  // Try 7, "bell" as well.
    System.out.println("hello" + back + "world");
    
    
  • The easiest way to use stuff like newline characters is to use escape characters (see website).
    	
    System.out.println("hello\nworld");
    
    

Binary vs. Text files

  • Note that:
    	
    00000000 00110001 =  code 49  = char  "1"
    
    
    This seems much smaller: it only uses 2 bytes to store the character "1", whereas storing the int 1 takes 4 bytes.
  • However, each character takes 2 bytes, so:
	
00000000 00110001 =  code 49  = char  "1"
00000000 00110001 00000000 00110010 =  code 49, 50  = char  "1" "2"
00000000 00110001 00000000 00110010 00000000 00110111 =  code 49, 50, 55  = char  "1" "2" "7"

  • Whereas with a single int of 4 bytes, we can store 127, thus:
	
00000000 00000000 00000000 01111111 = int 127
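
The size difference can be measured rather than counted by hand. This is a sketch under two assumptions: the binary form is written with DataOutputStream's writeInt(), and the text form uses the 2-byte-per-character UTF-16BE encoding to match the layout above. SizeComparisonDemo, binarySize and textSize are made-up names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SizeComparisonDemo {
    // Bytes needed to store the value as a binary int.
    static int binarySize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value); // always writes exactly four bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Bytes needed to store the value's digits as 2-byte characters.
    static int textSize(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        System.out.println(binarySize(127)); // 4
        System.out.println(textSize(127));   // 6
        System.out.println(textSize(1));     // 2
    }
}
```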

Binary vs. Text files

  • In short, it is much more efficient to store anything with a lot of numbers as binary (not text).
  • However, as disk space is cheap and networks are fast, and because it is useful to be able to read data in Notepad etc., people are increasingly using text formats like XML.
  • As we'll see, the filetype determines how we deal with files.

Review

	
File f = new File("e:/myFile.txt");

  • Three methods of getting file locations:
    1. Hardwiring
    2. FileDialog
    3. Class getResource()
  • Need to decide the kind of file we want to deal with.