My computer AKA my first big data machine

Summary: Intro | Virtualisation Software | Cloudera’s QuickStart VM | Importing a VM

In this post, I will introduce Virtual Machines (VMs): a core tool for any data scientist. If you, like me, get to experiment with different technologies at work, you are probably familiar with them. VMs are the best way to test something out without having to install it on your computer and risk messing up your working environment. In essence, a VM is like a mini (virtual!) computer you put on your computer; it has its own operating system, such as Windows, Linux or macOS, and it usually comes with a bunch of pre-installed and configured tools, so that you don’t have to worry about any (or much) setup. So you might have a Windows machine installed on your actual Windows machine, and while these two share computing resources and disk space, they are separate instances of Windows. Plus, you can delete or change the virtual machine as you please, you can have many of them and, by design, this has no impact on your original working environment.


Hello World

A traditional first post, then!

I want this space to become a journal of my wanderings in the world of data analysis. While I’ve been working as a consultant for a few years now, there is a multitude of topics I have never tackled and technologies I know nothing about. The idea for this learning space is to tackle a different problem every week or two.

I am no writer, and my previous experience is in creating functional specifications, where every term had to be precise and the sentences kept short and simple. In my Business Analyst beginnings the documentation I produced was poor, but with time I saw my writing quality go up. With that in mind, I expect ‘blogging’ to be a learning curve too (but I am keeping a positive outlook).


Eve the Analyst