Data Science Update IV


A full year has passed since I decided to start this blog and document my journey to becoming a data scientist. I’m happy to say that the last three months have been the most productive so far!

My previous update can be found here.

To stay accountable, this post will review the goals that I set three months ago and lay out new goals for the upcoming six months. That means you can expect the fifth Data Science Update installment to be posted on July 1st, 2019.


Pass my Coursework

Success!

By far the most time-intensive and important accomplishment this quarter was completing the first semester of graduate school. I’ve completed my coursework in Probability Theory (B), Statistics for Experimenters (A), and Programming in SAS / R (A).

I’ve learned more about statistics than I ever thought possible in a single semester. Working in the Statistics tutoring lab for 20 hours a week ensured continuous exposure to fundamental topics in statistics.


Complete 2 Blog Posts

Success!

I went above and beyond, publishing three of my best posts to date. The Slot Machine Function, written in R, is fairly long but well explained. L.A. Parking Citations - EDA Part I and L.A. Parking Citations - EDA Part II are my first posts written in Python, and they were an absolute joy to make.

I’ve completed most of the third and fourth installments of this project; however, they’re still in the process of being edited and posted to this site.


Update Fisherankney.com

Success!

This site has been updated several times; it’s a continuous work in progress.


Complete and Review 7 Coding Resources

Semi-Success!

In a blaze of ambition, I attempted to complete and review 7 different programming resources in the last three months. The resources are:

  • Hands-On Programming with R by Garrett Grolemund (Success)
  • Statistics in R by Sudha Purohit (Success)
  • R in Action, Second Edition by Robert Kabacoff (Fail)
  • Learn Python the Hard Way by Zed Shaw (Success)
  • Think Python by Allen B. Downey (Success)
  • Automate the Boring Stuff by Al Sweigart (Fail)
  • SQL - MySQL for Data Analytics and Business Intelligence (Udemy Course) (Fail)

As you can see, I successfully worked through, and reviewed, 4/7 of the resources. You can check out what I thought about each of the resources here. As for the remaining resources, their fates are mixed.

I will complete the SQL course within the next few weeks. I’ve deemed SQL to be an extremely important technology that I can no longer ignore. ‘Automate the Boring Stuff’ will be deferred until late 2019, or possibly early 2020. And I’ve nixed R in Action altogether. I’m bored of working through remedial R resources.


In The Next 6 Months

To better align with the semester schedule, I’m reorganizing these updates to cover six months of goals. I’ve got big plans for the next half year, and while it’s hard to say exactly what will be accomplished, this is what I’m aiming for.


M.S. Coursework

Spring of 2019 is my second semester in graduate school, and I suspect the workload will be similar to this semester’s. I’ll be completing the following courses:

  • Applied Regression Analysis
  • Statistics for Experimenters II
  • Mathematical Inference

I’ll be working 20 hours a week as a graduate teaching assistant as well. While the grades that I earn in these classes aren’t terribly important, C’s are unacceptable to the department and must be avoided at all costs.


Python

I’ll be focusing on Python a lot in the next six months. For explicit Python practice, I want to complete these tasks:

  • Complete 100 HackerRank assignments
  • L.A. Parking Citations - EDA Part III
  • L.A. Parking Citations - EDA Part IV
  • Finish Python Data Science Handbook

The HackerRank assignments will allow me to continually develop my core programming skills, and the Python Data Science Handbook will give me more domain-specific knowledge. On top of that, a lot of my machine learning goals incorporate Python, and are expanded upon below.


Tableau

I’ve heard mixed reviews about Tableau from the data science community. It cannot be denied that Tableau is a popular technology, and for that reason I think it’s worth my time to check out. Interactive dashboards and data visualizations are great fun, but I want to choose wisely between R Shiny, Python Dash, and Tableau.

  • Sign up for Tableau Free Edition
  • Blog Post on Tableau Project
  • Blog Post on How-to-Tableau
  • Blog Post on Tableau vs. Shiny vs. Dash


MySQL

While SQL isn’t an exciting, cutting-edge technology like TensorFlow or AWS, it is an essential skill for any data scientist to master. How am I supposed to gather insight from data if I can’t gather it properly?

I’ll be completing the following tasks -


Machine Learning

After a year of hard work on the fundamentals of data science, I’m so excited to jump into machine learning and predictive analytics.

  • Andrew Ng Machine Learning Coursera Course
  • Fast.ai Practical Deep Learning
  • Fast.ai Cutting Edge Deep Learning for Coders
  • Fast.ai Introduction to Machine Learning for Coders
  • Complete Introduction to Statistical Learning
  • Complete Hands-On Machine Learning with Scikit-Learn and TensorFlow
  • Compete in a Kaggle Competition


Andrew Ng’s machine learning course on Coursera is widely acclaimed as one of the best foundational resources on the subject. Working through the book ‘Introduction to Statistical Learning’ in R will allow me to implement new skills as I learn them, hopefully making it an extremely effective learning resource. The Fast.ai courses are highly recommended throughout the community, and I’ve been dreaming of competing in a Kaggle competition for well over a year now.


Conclusions

While it’s hard to pin down an entire six months of learning, I think this is a great set of goals to work towards! To be honest, I’m just now realizing how complex data science is.

In the end, I think it’ll all be worth it. It’s encouraging that there are so many ways to ‘level up’ and become more skilled in this field. I think that I’ll really appreciate the breadth of this field, once I officially break into it!

Thanks for following me through the first year of this journey. Here’s to the next!


Until next time,
- Fisher
