Homework 1

Introduction

This homework will cover basic Java control structures, 1-D arrays, file I/O, and string manipulation.

Downloadables

Problem Description

You are a Georgia Tech professor teaching a class. It is the end of the semester and you want to visualize the distribution of your students’ grades.

Solution Description

Write a single Java class that, when run as a console program in a directory containing a properly formatted grades file (more below), prints a histogram of the grade distribution to STDOUT (the console). Name your class Histogram and save its source code in an appropriately named file.

A histogram is a graphical representation of a sequence of “bins”, each of which represents a range of values. Each bin holds the number of data points that fall within its range. Your program will take two inputs (more later): the name of a data file and the number of bins to create in the histogram. You will represent the bins as elements of an array, and your program will count how many values from the data file fall within each bin’s range. The data values will lie in the range [0, 100]. For example, if we had a data file with these contents:

Dead Starks, 6
Dead Lannisters, 3
Luftballons, 99

and we created a histogram with two bins, the array representing the bins would contain [2, 1]: the first element holds the count of values in the interval [0, 50] and the second element holds the count of values in [51, 100].
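The two-bin example above can be reproduced with a small sketch. This is only one possible way to map a grade to a bin, assuming the rule that any leftover values go in the lowest bin; the class and method names (BinSketch, binIndex, count) are hypothetical and not part of the assignment.

```java
// Illustrative sketch only; BinSketch, binIndex, and count are hypothetical names.
class BinSketch {
    static final int NUM_VALUES = 101; // grades 0 through 100

    // Maps a grade to its bin index, where bin 0 is the lowest range.
    // Assumes 1 <= numBins <= 101 and 0 <= grade <= 100.
    static int binIndex(int grade, int numBins) {
        int size = NUM_VALUES / numBins;  // values covered by a regular bin
        int extra = NUM_VALUES % numBins; // leftover values, absorbed by bin 0
        if (grade < size + extra) {
            return 0;
        }
        return (grade - extra) / size;
    }

    // Counts how many grades fall in each bin.
    static int[] count(int[] grades, int numBins) {
        int[] bins = new int[numBins];
        for (int grade : grades) {
            bins[binIndex(grade, numBins)]++;
        }
        return bins;
    }

    public static void main(String[] args) {
        int[] bins = count(new int[] {6, 3, 99}, 2);
        System.out.println(bins[0] + " " + bins[1]); // prints 2 1
    }
}
```

The array is sized by the number of bins, so its length is known before any data is read.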

After counting the number of grades in each bin, your program will print a horizontal histogram. You must also label the range of each bin, which can be derived from the number of bins and the range of data values. Every bin should cover the same number of values, except that when the number of bins does not divide 101 evenly, the leftover values are included in the lowest bin.

The Histogram should look something like this:

100 - 91 | [][][][][][][][][][][][][]
 90 - 81 | [][][][][][][][][][][][][][][][][][][][][][][][][][]
 80 - 71 | [][][][][][][][][][][][][][]
 70 - 61 | [][][][][][][][][][][][][][][][]
 60 - 51 | [][][][][][][][]
 50 - 41 | [][][][][]
 40 - 31 | [][][][][][][]
 30 - 21 | [][]
 20 - 11 |
 10 -  0 | [][]

The above histogram has 10 bins. Because there are 101 possible grades (0-100) and 101 is not divisible by 10, the lowest bin includes the one extra value. The class had 2 grades in the range 0-10, 0 grades in the range 11-20, 2 grades in the range 21-30, etc.
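As a rough illustration of how the row labels could be derived and printed, here is a sketch that computes each bin's bounds and builds one row. The format string "%3d - %2d | " is an assumption inferred from the sample histogram above, and the names RowSketch, low, high, and row are hypothetical; always check the spacing against the official expected output.

```java
// Illustrative sketch only; all names here are hypothetical.
class RowSketch {
    static final int NUM_VALUES = 101; // grades 0 through 100

    // Lowest grade covered by a bin (bin 0 is the lowest range).
    static int low(int bin, int numBins) {
        int size = NUM_VALUES / numBins;
        int extra = NUM_VALUES % numBins;
        return bin == 0 ? 0 : extra + bin * size;
    }

    // Highest grade covered by a bin.
    static int high(int bin, int numBins) {
        int size = NUM_VALUES / numBins;
        int extra = NUM_VALUES % numBins;
        return extra + (bin + 1) * size - 1;
    }

    // Builds one histogram row: a right-aligned label followed by one "[]" per grade.
    // The label format is an assumption based on the sample output.
    static String row(int bin, int numBins, int count) {
        StringBuilder sb = new StringBuilder(
            String.format("%3d - %2d | ", high(bin, numBins), low(bin, numBins)));
        for (int i = 0; i < count; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] counts = {2, 1}; // counts from the two-bin example
        // Print from the highest bin down to the lowest.
        for (int bin = counts.length - 1; bin >= 0; bin--) {
            System.out.println(row(bin, counts.length, counts[bin]));
        }
    }
}
```

Running main prints the row for 100 - 51 with one marker, then the row for 50 - 0 with two markers.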

Grades File

We have provided you with a CSV file that has a list of students and their grades. A CSV file is just a text file with data partitioned by commas and (in this case) newlines. Note that there may be any number of spaces surrounding the comma.

These grades are not sorted, but they are integers between 0 and 100 (inclusive). For example, the file may look like:

Glenn Hollingsworth,91
Chris Simpkins, 100
Thomas Shields, 89
Bob,55
Alice,   95
Eve, 87

Your program may read through the file only once. Do not scan through the contents of the file multiple times.
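Since each line can be handled as soon as it is read, one pass is enough. The sketch below shows one possible way to pull the grade out of a single line while tolerating spaces around the comma. It assumes the grade follows the last comma on the line; the names LineSketch and parseGrade are hypothetical.

```java
// Illustrative sketch only; LineSketch and parseGrade are hypothetical names.
class LineSketch {
    // Extracts the integer grade from a line such as "Alice,   95".
    // Assumes the line contains at least one comma and that the grade follows the last one.
    static int parseGrade(String line) {
        int comma = line.lastIndexOf(',');
        String gradeText = line.substring(comma + 1).trim(); // drop surrounding spaces
        return Integer.parseInt(gradeText);
    }

    public static void main(String[] args) {
        System.out.println(parseGrade("Glenn Hollingsworth,91")); // prints 91
        System.out.println(parseGrade("Alice,   95"));            // prints 95
    }
}
```

Each parsed grade can be tallied into its bin immediately, so nothing needs to be stored or re-read.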

The file we provide is just an example of what the file could be. You should test your code with other grades files and other bin sizes.

Input

You will use command-line arguments to tell your program the location of the grades file. See Expected Output for how to pass the file name when running the program.

You must allow the user to specify the number of bins in either of the following two ways:

  1. The number of bins may be given as an additional command-line argument, e.g. java Histogram grades.csv 5
  2. If the second command-line argument is not present, your program must ask the user for the number of bins at the beginning.
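The two cases above might be handled along these lines. This is a sketch under stated assumptions: the names ArgsSketch and resolveBins are hypothetical, and the PROMPT string is a placeholder, not the required wording, so substitute the exact prompt from the assignment's example.

```java
import java.util.Scanner;

// Illustrative sketch only; ArgsSketch and resolveBins are hypothetical names.
class ArgsSketch {
    // Placeholder text: replace with the exact prompt shown in the assignment's example.
    static final String PROMPT = "PROMPT TEXT GOES HERE: ";

    // Returns the number of bins from args[1] if present, otherwise asks on the console.
    static int resolveBins(String[] args, Scanner console) {
        if (args.length >= 2) {
            return Integer.parseInt(args[1]);
        }
        System.out.print(PROMPT);
        return console.nextInt();
    }

    public static void main(String[] args) {
        // Simulates: java Histogram grades.csv 5
        String[] withBins = {"grades.csv", "5"};
        System.out.println(resolveBins(withBins, new Scanner(System.in))); // prints 5
    }
}
```

Passing the Scanner in as a parameter keeps the logic easy to try with canned input instead of the keyboard.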

Expected Output

Running the program should look like this:

Note: $ is the command prompt on Unix. On Windows it will look like C:>

IMPORTANT: The spacing here is very important. You must use the same spacing scheme as our examples or you will lose points. Also make sure the prompt for the number of bins is the same as our example.

Allowed imports

You are allowed to import the following classes and only the following classes:

Tips

  1. You may assume that you always get valid input.
  2. You may assume the text file has valid numbers.
  3. 101 is a prime number.
  4. Try using 101 as the number of bins before you submit.
  5. Try using printf.
  6. An array is a fixed size data structure; you need to know ahead of time how big it needs to be. How do we do this?
  7. You can give interpretations to the indices and contents of an array to arrive at creative solutions to problems. Code smart, not hard.
  8. Creating a Scanner object with a file can throw a checked exception. Don’t worry about what this means for now; just append throws Exception to the signature of the main method in which the file is opened.
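To illustrate the last tip, here is a minimal sketch of opening a file with a Scanner from a main method declared with throws Exception and reading it once, line by line. It assumes java.io.File and java.util.Scanner are permitted imports; the names ReadSketch and echoLines are hypothetical.

```java
import java.io.File;
import java.util.Scanner;

// Illustrative sketch only; ReadSketch and echoLines are hypothetical names.
class ReadSketch {
    // Reads every remaining line exactly once, prints it, and returns the line count.
    static int echoLines(Scanner in) {
        int lines = 0;
        while (in.hasNextLine()) {
            String line = in.nextLine();
            System.out.println(line);
            lines++;
        }
        return lines;
    }

    // "throws Exception" lets the checked exception from new Scanner(File) propagate.
    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java ReadSketch <file>");
            return;
        }
        Scanner in = new Scanner(new File(args[0]));
        System.out.println(echoLines(in) + " lines read");
        in.close();
    }
}
```

Without the throws clause, the compiler rejects the new Scanner(new File(...)) line because the possible FileNotFoundException is neither caught nor declared.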

Grading

Checkstyle

For each of your homework assignments we will run checkstyle and deduct one point for every checkstyle error.

For this homework the checkstyle cap is 0, meaning you won’t lose points on this assignment due to style errors. This limit will increase with each homework.

Collaboration

When completing homeworks for CS1331 you may talk with other students about:

Examples of approved/disapproved collaboration:

OKAY: “Hey, I’m really confused on how we are supposed to implement this part of the homework. What strategies/resources did you use to solve it?”

BY NO MEANS OKAY: “Hey… the homework is due in like 20 minutes… Can I see your code? I promise I won’t copy it directly!”

In addition to the above rules, you may not upload your code to any sort of public repository. Doing so could be considered an Honor Code violation, even if it happens after the homework is due.

Submission

Have fun!