Data Science Update III


Nine months! Three seasons! Two hundred and seventy days of data science!

Early on, I set a goal for myself to periodically report my data science progress and put forward concrete plans for the near future in Data Science Updates on my blog. This is my third such update, and I’m happy to say that I’ve been productive in the last three months, and I plan on taking it to the next level in the next three.

My second update can be found here and my first update can be found here.

The Last 3 Months

To keep myself accountable, I set out these 5 goals in the previous update:

  • Move to Stillwater, Oklahoma and begin M.S. Statistics Program
  • On track to earn A’s in each of my classes
  • Exceed expectations in GTA work and research
  • Read Introduction to Statistics
  • Complete 5 more blog posts


Move to Stillwater, Oklahoma

Complete

I moved to Stillwater, Oklahoma on August 1st, and have been living here for two whole months! While it’s extremely hot here, and very flat, I’m doing alright. There’s a Starbucks, and a Qdoba, so I think I’ll be able to manage.


On Track to Earn the A

Complete

I’m currently earning a 100% in my Statistics for Experimenters I course, a 93% in my SAS course, and a 98% in my Probability Theory course. I do have a Probability Theory exam coming up which I’m studying for, but I think I’ll do just fine.


Exceed Expectations in GTA and Research

Complete

There isn’t that much to say here. GTA duties include tutoring in the Statistics-help room for 20 hours a week. Potential research includes creating an R package for a specific statistical analysis technique.


Read Introduction to Statistics

Fail

I got about 150 pages into this book and found it was a bit elementary for me. I’m ok with not finishing this goal.


Complete 5 Blog Posts

Complete

I’m very happy to report that I’ve completed 5 blog posts, and I’m quite proud of them. My first post of the quarter, Casino Games - Roulette, is my longest analysis yet. I’ve finally finished my Unpacking the Tidyverse series; the forcats and purrr posts were especially fun to create. Finally, Casino Games - Dice was a short post that I quickly put together to spell out some ideas related to probability. It came together in less than 3 hours and I’m very happy with the finished product.

Overall, I’ve accomplished a tremendous amount in the last 3 months, especially considering I spent most of July in Yellowstone and Grand Teton National Parks.


The Next Three Months

Setting goals is always my favorite part of… well, everything. In the next three months, I hope to accomplish many things.


Pass my Coursework

I’ve figured out that GPA doesn’t matter anymore in graduate school. I need to make B’s in my courses, and I plan on doing so. If I get an A, then great! But I’d rather invest that time difference into studying computational statistics and data science.


Complete 2 Blog Posts

That’s right, only two. I’m shifting my focus away from short EDA projects and package tutorials for this quarter. I plan on finishing up my Casino Games series by creating ambitious posts on card games and slot machines in R.


Update Fisherankney.com

As always, I wish to improve my website / portfolio. Small, iterative improvements make a big difference over time. This quarter, I hope to create a resource review section, where I’ll do just what the title suggests. I’ve been burning through a lot of data science resources, and I think it would be beneficial to me and others if I compiled, summarized, and rated the resources I’ve used.


Complete and Review 7 Coding Resources

The resources are:

  • Hands-On Programming with R by Garrett Grolemund
  • Statistics in R by Sudha Purohit
  • R in Action, Second Edition by Robert Kabacoff
  • Learn Python The Hard Way by Zed Shaw
  • Think Python by Allen B. Downey
  • Automate the Boring Stuff by Al Sweigart
  • SQL - MySQL for Data Analytics and Business Intelligence (Udemy Course)

That’s three R books focusing on data analysis, three Python books focusing on building a solid programming foundation, and one MySQL course focused on business applications. This will give me a good foundation to jump into more advanced data analysis techniques in R next semester (Advanced R by Hadley Wickham, Efficient R Programming by Colin Gillespie, and more). I will also be building out my programming skills in Python extensively. I’m eyeing up the Python Data Science Handbook for early 2019.


An ambitious quarter, but I think I’m up to the challenge.

Until next time,
- Fisher
