Browsing Category

Data

DIY Data Science, Part 4: G is for Graph Analysis

July 22, 2017

It’s been tricky keeping this series up! I’ve started learning about a lot of things at the same time and it’s been quite difficult to neatly isolate one bit to write about. A few weeks ago, however, I decided my social network analysis skills (or, as I like to call them here, my graph analysis skills – poor G was an unpopular letter while S was in high demand) needed a little refresher. I dedicated a big chunk of my Master’s degree last year to network analysis, and I loved it, but I hadn’t done anything in that space since. I mainly just wanted to check that I still had it – that I was still able to extract and build a network from messy social data without having a minor or major nervous breakdown. Continue Reading…

DIY Data Science, Part 3: W is for Web Scraping

May 22, 2017

I’ve been quite excited to dive into web scraping. After all, my teenage self used to copy and paste HTML code like crazy to put together amateur website templates, and scraping is also maybe the most intuitive way of collecting data online – just grab what you see. In practice, it’s obviously a little trickier than that. Continue Reading…

DIY Data Science, Part 2: M is for Machine Learning

May 1, 2017

I’m keeping it short & sweet this week; many of the letters I’ve chosen for this project will focus on machine learning algorithms in more detail, so here’s a tiny visual overview instead. For reference, I’m planning on watching all of these lovely lectures in the future and potentially investing in this book to get a little more grounding in the maths behind machine learning (hello darkness, my old friend).

One of the lovely things about this project is also that there’s no point in being a perfectionist; I won’t be able to produce flawless content every week, so it’s nice to embrace ‘good enough’, mix it up (like this week) and make it a priority to keep going. Continue Reading…

DIY Data Science, Part 1: P is for Pandas

April 23, 2017

This past week, my programming brain has been all over the place. I’ve been looking into Natural Language Processing and trying to learn how to build a web application in Flask, and it all felt like a huge, complicated mess in my mind for most of the week. So I decided to turn to the familiar for comfort and kick off DIY Data Science with a topic that is near and dear to my heart – Pandas, Python’s superstar data analysis package.

Hello Pandas, my old friend

This is not the first time. Last December, after a few months of non-coding, I was desperate to get back into it. So my friend Zack offered to help me out and set me a little data cleaning and wrangling challenge to solve using Pandas. Continue Reading…

DIY Data Science: The Alphabet

April 13, 2017

A few weeks ago, I decided to properly commit to learning some of the many open-source tools, skills and concepts that can help you get your hands dirty with data, dig out valuable insights, communicate them successfully and make them look pretty.

Since data science can be an overwhelming field and there’s an abundance of free online resources on offer – some amazing, some of questionable quality – I realised that I would need to structure this somehow. Continue Reading…

What I Talk About When I Talk About Coding

April 1, 2017

Recently, I’ve had a little bit of time to get back into coding and once again realised how much I enjoy it. Python is my language of choice, and I’m by no means an expert (though I’m pretty sure I can get there one day if I keep going). So far, I have mostly focused on data collection, cleaning, analysis and visualisation. I only started coding a little over a year ago during my Master’s degree; that is, if you don’t count my teenage self’s dabbling in HTML and CSS to make my first ever website look nice – yes, I was definitely one of the cool kids. Continue Reading…