Milestone I: Data Preparation

Part 1: Read In The Stock Data

One of the first goals for the project is to import the financial data you were given, provided as a CSV-formatted text file, into your program. Get started by downloading the stock data for both Apple and Microsoft, as this is the data you will use for testing.

The data provided has the following columns: Date, Open, Close, High, Low, Volume, and Adj Close.

CSV is a relatively simple text file format. Inspect the data to become familiar with what it contains; you can do so by opening it in your favorite text editor (TextEdit, Notepad, Vim, Sublime Text, etc.).

The data you will read in is ordered by date (but non-contiguous, due to holidays and weekends) and reflects a timeline of stock performance in reverse chronological order. Your program, however, should run through the days in the data from the earliest date to the most recent.
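As a rough illustration of the ordering issue, here is a minimal sketch that reverses a newest-first list and looks up a row by trading-day number. The rows are made up for the example, and row_for_day() is a hypothetical helper name, not something the project requires.

```python
# Made-up rows in the file's order (newest first); the real values will differ.
rows_newest_first = [
    ["2000-01-11", "103.0"],
    ["2000-01-10", "102.0"],   # a Monday
    ["2000-01-07", "101.0"],   # the Friday before; the weekend has no rows
]

# Reverse a copy so that index 0 holds the earliest trading day.
rows_oldest_first = rows_newest_first[::-1]


def row_for_day(rows, day):
    """Return the row for a 1-based trading-day number (hypothetical helper)."""
    if not 1 <= day <= len(rows):
        raise ValueError(f"day must be between 1 and {len(rows)}")
    return rows[day - 1]


print(row_for_day(rows_oldest_first, 1))  # ['2000-01-07', '101.0']
print(row_for_day(rows_oldest_first, 2))  # ['2000-01-10', '102.0']
```

Note that day 2 is the Monday, not the Saturday: day numbers count rows of data, not calendar days.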

Now, ask yourself: is the data arranged how you would like it?

You can write whatever code you need to get the job done. The only strictly defined and required function is one named test_data(), and it should be a simple function, relying on calling other functions you have written. It should have the following characteristics, to be filled in by you (along with the docstring):

def test_data(filename, col, day):
    """A test function to query the data you loaded into your program.

    Args:
        filename: A string for the filename containing the stock data,
                  in CSV format.

        col: A string of either "date", "open", "high", "low", "close",
             "volume", or "adj_close" for the column of stock market data to
             look into.

             The string arguments MUST be LOWERCASE!

        day: An integer reflecting the absolute number of the day in the
             data to look up, e.g. day 1, 15, or 1200.

    Returns:
        A value selected for the stock on some particular day, in some
        column col. The returned value *must* be of the appropriate type,
        such as float, int or str.
    """

This function is used primarily for me to test your code for grading. It is not intended to be used in later Milestones, and it will be problematic if you do attempt to use it in later Milestones. Future Milestones will rely on other functions that you hopefully will write to support your test_data() function.
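One possible shape for those supporting functions is sketched below. It is only an illustration under some assumptions: the names COLUMN_TYPES, read_rows() and lookup_value() are hypothetical, the file is assumed to have a header row naming the columns, the newest day is assumed to come first, and the volume values are assumed to be whole numbers. Your own design may look quite different.

```python
import csv

# Hypothetical mapping from column name to the type its values should have.
COLUMN_TYPES = {
    "date": str,
    "open": float,
    "high": float,
    "low": float,
    "close": float,
    "volume": int,      # assumes volume is written as a whole number
    "adj_close": float,
}


def read_rows(filename):
    """Read the CSV into a list of dicts, earliest trading day first."""
    rows = []
    with open(filename, newline="") as handle:
        for raw in csv.DictReader(handle):
            row = {}
            for name, text in raw.items():
                # Normalize header names, e.g. "Adj Close" -> "adj_close".
                key = name.strip().lower().replace(" ", "_")
                # Unknown columns are kept as plain strings.
                row[key] = COLUMN_TYPES.get(key, str)(text)
            rows.append(row)
    rows.reverse()  # assumes the file lists the newest day first
    return rows


def lookup_value(filename, col, day):
    """Return the value in column col for the 1-based trading day number."""
    rows = read_rows(filename)
    return rows[day - 1][col]
```

With helpers along these lines, test_data() could be a one-line call to lookup_value(filename, col, day), while later Milestones call read_rows() directly.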

This is how I will test your code in the file project.py (where 1 is the 1st day and 25 is the 25th day of data). For example, using the Apple stock data:

>>> val = test_data("AAPL.csv", "close", 25)  # Close data for 25th day for AAPL.
>>> print(val)
21.125
>>> type(val)
<class 'float'>
>>>

or, using the Microsoft stock data:

>>> val = test_data("MSFT.csv", "close", 25)  # Close data for 25th day for MSFT.
>>> print(val)
62.5625
>>> type(val)
<class 'float'>
>>>

You should write a main() function, but it shouldn’t do anything yet. Feel free to write main() like this:

def main():
    pass  # Do nothing, just passing through!

Part 2: Stock Bookkeeping

Another primary aspect to work on is bookkeeping! You are given an ‘account’ with $1000 for buying and selling stocks. How do you represent this in your code? How would you track how many stocks you currently own?

This can be done with a couple of simple variables holding your balance and stock count, or you may want a more complicated system involving functions that properly control how your money and stocks are manipulated. It’s your choice!
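To make the second option concrete, here is a minimal sketch of function-based bookkeeping. The function names buy() and sell(), their parameters, and the prices used are all assumptions for illustration; only the $1000 starting balance comes from this handout.

```python
def buy(balance, shares, price, quantity):
    """Return the new (balance, shares) after buying quantity shares."""
    cost = price * quantity
    if quantity < 0 or cost > balance:
        raise ValueError("invalid purchase: negative quantity or not enough money")
    return balance - cost, shares + quantity


def sell(balance, shares, price, quantity):
    """Return the new (balance, shares) after selling quantity shares."""
    if quantity < 0 or quantity > shares:
        raise ValueError("invalid sale: negative quantity or not enough shares")
    return balance + price * quantity, shares - quantity


balance, shares = 1000.0, 0                           # starting account
balance, shares = buy(balance, shares, 21.125, 10)    # spend 211.25
balance, shares = sell(balance, shares, 25.0, 4)      # receive 100.0
print(balance, shares)  # 888.75 6
```

Passing the balance and share count in and out explicitly is one way to avoid global variables; two plain variables updated in place would also satisfy the requirement.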

Helpful Hints

  1. It would be good form to have separate functions to open, read, and parse the CSV file, all called from inside test_data().
  2. There is no need to ask the user for a filename. Unit test your function by calling it inside the interpreter with the filename for your data you want to use.
  3. Global variables are useful in this project: easy to use, but easy to abuse. Use them judiciously. This project can be done without using global variables, however.
  4. Lists are virtually required here, so make sure you call help(list) in the Python interpreter to know what a list can do for you!
  5. You will not be allowed to edit the CSV file directly. You can only change the data after you read it in using your Python program.
  6. The dates listed in the stock data aren’t contiguous. You will have gaps that correspond to weekends or market holidays. Do not count those missing dates as part of your day number counting.
  7. Don’t trust Excel to give you the true view of the CSV data; use a text editor.

Milestone I Submission Requirements

  1. Data File (10 points)
    Your code should work with a pristine, unedited copy of the data I provided, using the correct filename. Any changes to the data need to be done directly in your Python code.
  2. Data Handling (10 points)
    You are free to read in the data using any programming techniques you think work for you. However, this data must be stored in a way that makes later referencing of stock prices easy.
  3. Bookkeeping (8 points)
    The later milestones will need to know how much money and how many stocks you own in order to allow you to make sensible decisions. There needs to be appropriate code that will be used to keep track of both.
  4. Testing (6 points)
    I will test your code, and therefore I require that you create a function named test_data() as specified above, where:
    • The function takes as arguments a string for the filename, a string for the column (one of date, open, high, low, close, volume, or adj_close), and an integer for the day in the data you want to refer to.
    • The function returns the value of the stock data for the particular column and day number, with the appropriate type (for example, a float for a price).

Points may be generally deducted across all categories in cases of syntax, style or running issues.