Data Science Update


I’m now three months into my journey to become a data scientist. I finally made the decision to fully pursue data science near the end of my around-the-world trip in October of 2017. Upon returning to the United States, I almost immediately took the GRE and started my application to statistics graduate programs to solidify my transition. On January 1st, 2018, I stopped planning the transition and started making it happen. To keep motivated and accountable on this path, I’m recording a quarterly update to summarize what I’ve accomplished in the last three months and what I hope to accomplish in the next three.


M.S. Statistics Applications Submitted

I have an undergraduate degree in geophysics, and while my undergraduate education was heavy in quantitative scientific methods, I still lack what I perceive to be proper data science credentials. I’ve applied to seven programs across the country, with the goal of receiving full funding to complete the degree. M.S. Statistics programs are in high demand and thus very competitive. As of today, my application results are as follows:

  • Iowa State University - Unknown
  • Colorado State University - Rejected
  • University of Vermont - Accepted, 1/2 Funding
  • University of New Hampshire - Accepted, Unknown Funding
  • Kansas State University - Accepted, Unknown Funding
  • Oklahoma State University - Unknown
  • Utah State University - Unknown

The deadline for enrollment decisions is April 15th, so the next two weeks should be exciting!


FisherAnkney.com

One of the accomplishments that I’m most proud of is the creation of this website! FisherAnkney.com went from an idea to a fully functioning interactive website in less than a week. My last pet project, Wayfaring Worker, gave me a lot of practical experience in creating and hosting a personal website. FisherAnkney.com is hosted on GitHub Pages and built with Jekyll, a static site generator. Based on the Hydeout theme, FisherAnkney.com is optimized for speed and mobile devices.

Typically, posts are written in RMarkdown in the RStudio IDE, and then exported as markdown files to my favorite text editor, Visual Studio Code. Once they’re polished and ready to be published, a simple git push command is all it takes to make the post live. This makes for a quick and simple workflow that encourages me to complete and post my data science projects for all to see.


10 Blog Posts

Using the process described above, I’ve completed 10 blog posts in the last three months; that’s almost a blog post every single week! My favorite blog post so far is Text Mining the Hobbit.

Learning data science through the creation of blog posts is an extremely rewarding experience. By creating blog posts, you’re forced to finish your projects, explaining the methods and results in your own words. Projects become contained in scope, and specific in their goals; it becomes easy and fun to explore new packages, methods, and ideas.

If you’re on the fence about creating a technology blog, I absolutely encourage you to dive right in and give it a shot!


Read R for Data Science

R for Data Science by Hadley Wickham and Garrett Grolemund is an excellent introductory book on the methodology of data science through the R programming language. At the equivalent of 550 pages, R4DS took me a while to work through completely. I would estimate that I’ve tripled my competency in R by thoroughly working through R4DS, and I can’t recommend it enough.


Completed 10 Courses on Data Camp

Data Camp is an interactive online resource that I used to explore a variety of technologies, totaling around 40 hours of coursework. Some of the courses I completed include introductions to Git, the shell, geospatial analysis, and the stringr package. While I found Data Camp to be a helpful resource, I’ve completed my one-month free trial and I don’t plan on purchasing a subscription.


17 R Packages

As a side effect of the above achievements, I’ve become familiar with 17 new R packages in the past three months. Some of my favorites are, of course, those in the core tidyverse, along with tmap, classInt, viridisLite, and sp. Using these packages to tackle a variety of problems has made me a more flexible and effective R user. Whenever I run into a problem in R, there’s always a package ready to help!


The Next Three Months

All in all, I’m proud to say that I’ve consistently gotten better at R and data science over the last three months, but I plan on accomplishing even more over the next three.

By July 1st, 2018, I plan to be fully enrolled in a statistics graduate program that starts this fall. I’ve already selected two books to read and work through: the first is Introduction to Statistical Learning and the second is Introduction to Statistics. I also plan on creating 10 more blog posts and updating the layout of fisherankney.com.

Thanks for following me on this journey, and check in for Data Science Update II on July 1st to see what happens when expectations meet reality!

until next time,
- Fisher
