Hello there and welcome to my website. This website is geared toward giving you a little insight into what I do, which is:

to teach people about computers in a way they’ve never dreamt of experiencing – a fun way, a fast way, an approachable way;

and to teach them no matter what professional walk of life they come from, how young or old they are, or what their gender, race, tribe or religion may be.

As computers and information technology advance at an ever-increasing pace, a phenomenon I call the DIGITAL DIVIDE has emerged: a rift between the people and organisations/companies that are on the digital side of the fence and those that are ignoring that world, continuing to lie on the other side – the old and dying side. My aim and raison d’être is to right that ill. If you have sensed that there is something happening out there and that you might be missing the bus, you’ve come to the right place. And remember …

Nothing in this world runs without some form of technology behind it. If you are the kind of person that likes to know what makes our world tick, studying computers and computer science is the best and arguably the only way to learn what that is.

Do have a browse around (the links at left under the RECENT POSTS heading will serve you well) and if you have a question fire it off to dr.neal.aggarwal@gmail.com or reach out to me on Telegram at t.me/drnealaggarwal. My other contact details are on this page.

Some Covid Thoughts

6 to 12 minute read if you go through the whole thing. Less than 2 minutes to go through the bold text highlights.

Attribution: Erin S. Bromage, Ph.D. – I have taken a lot of Dr Bromage’s thoughts and added my own discussion and emphasis. I have also provided links to curated and peer-reviewed sources of information. Click on the links to check for yourselves.

The Risks – Know Them – Avoid Them

An epidemic curve has a relatively predictable upslope, and once the peak is reached the back slope can also be predicted. Robust data from the outbreaks in China and Italy show that the back of the mortality curve declines slowly, with deaths persisting for months. The US has just crested at 70,000 deaths, so it is possible that it will lose another 70,000 people over the next 6 weeks as it comes off that peak. That is what will happen with a lockdown in place, not if they open up fully – if they do that, all bets are off. Here in Kenya we have 715 cases (Day 60), so we can expect at least that many more if we continue the lockdown. (MOH Presentation, 19th May 2020)

As nations reopen the virus gets more fuel and then all bets are off. I understand the reasons for reopening the economy, but I’ve said before, if you don’t solve the biology, the economy won’t recover.

There are very few nations that have demonstrated a sustained decline in numbers of new infections. As of May 3rd the majority are still increasing and reopening. As a simple example of the USA trend, when you take out the data from New York and just look at the rest of the USA, daily case numbers are increasing. Bottom line: the only reason the total USA new case numbers look flat right now is because the New York City epidemic was so large and now it is being contained. (as of May 3rd)

Where are people getting sick?

We know most people get infected in their own home. A household member contracts the virus in the community and brings it into the house where sustained contact between household members leads to infection.

But where are people contracting the infection in the community? Grocery stores, bike rides, inconsiderate runners who are not wearing masks … are these situations of concern? Well, not really; the explanation follows …

In order to get infected you need to be exposed to an infectious dose of the virus; based on infectious-dose studies with other coronaviruses, it appears that only small doses may be needed for infection to take hold. Some experts estimate that as few as 1,000 SARS-CoV2 infectious viral particles are all that is needed (ref 1, ref 2). Please note, this still needs to be determined experimentally, but we can use that number to demonstrate how infection can occur: through 1,000 infectious viral particles received in one breath or from one eye-rub, or 100 viral particles inhaled with each of 10 breaths, or 10 viral particles with each of 100 breaths. Each of these situations can lead to an infection.

How much virus is released into the environment?

A Bathroom: Bathrooms have a lot of high touch surfaces, door handles, faucets, stall doors. So fomite transfer risk in this environment can be high. We still do not know whether a person releases infectious material in feces or just fragmented virus, but we do know that toilet flushing does aerosolize many droplets. Treat public bathrooms with extra caution (surface and air), until we know more about the risk.

A Cough: A single cough releases about 3,000 droplets, travelling at up to 50 miles per hour. Most droplets are large and fall quickly (gravity), but many do stay in the air and can travel across a room in a few seconds.

A Sneeze: A single sneeze releases about 30,000 droplets, with droplets traveling at up to 200 miles per hour. Most droplets are small and travel great distances (easily across a room).

If a person is infected, the droplets in a single cough or sneeze may contain as many as 200,000,000 (two hundred million) virus particles which can all be dispersed into the environment around them.

A breath: A single breath releases 50 – 5000 droplets. Most of these droplets are low velocity and fall to the ground quickly. There are even fewer droplets released through nose-breathing. Importantly, due to the lack of exhalation force with a breath, viral particles from the lower respiratory areas are not expelled.

Unlike sneezing and coughing, which release huge amounts of viral material, the respiratory droplets released from breathing contain only low levels of virus. We don’t have a number for SARS-CoV2 yet, but we can use influenza as a guide. Studies have shown that a person infected with influenza can release up to 33 infectious viral particles per minute.

If a person coughs or sneezes, those 200,000,000 viral particles go everywhere. Some virus hangs in the air, some falls onto surfaces, most falls to the ground. So if you are face-to-face with a person, having a conversation, and that person sneezes or coughs straight at you, it’s pretty easy to see how it is possible to inhale 1,000 virus particles and become infected.

But even if that cough or sneeze was not directed at you, some infected droplets–the smallest of small–can hang in the air for a few minutes, filling every corner of a modest sized room with infectious viral particles. All you have to do is enter that room within a few minutes of the cough/sneeze and take a few breaths and you have potentially received enough virus to establish an infection.

But with general breathing releasing roughly 20 viral particles per minute into the environment (rounding down from the 33 above to keep the math simple), even if every virus ended up in your lungs (which is very unlikely), you would need 1,000 ÷ 20 = 50 minutes of exposure to receive an infectious dose.

Speaking increases the release of respiratory droplets about 10 fold; ~200 virus particles per minute. Again, assuming every virus is inhaled, it would take ~5 minutes of speaking face-to-face to receive the required dose.
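The back-of-the-envelope arithmetic above can be sketched in a few lines of Python. The 1,000-particle infectious dose and the per-minute release rates are the rough estimates quoted above, not established values:

```python
# Rough time-to-infectious-dose arithmetic from the discussion above.
# All numbers are the rough estimates quoted in the text, not measured values.

INFECTIOUS_DOSE = 1000  # estimated SARS-CoV2 particles needed for infection

def minutes_to_dose(particles_per_minute, dose=INFECTIOUS_DOSE):
    """Minutes of exposure needed to inhale `dose` particles,
    assuming (unrealistically) that every particle is inhaled."""
    return dose / particles_per_minute

print(minutes_to_dose(20))   # breathing, ~20 particles/min -> 50.0 minutes
print(minutes_to_dose(200))  # speaking, ~200 particles/min -> 5.0 minutes
```

Change the inputs and you can see immediately why singing or yelling for hours in an enclosed room is so dangerous, while a brief passing encounter outdoors is not.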

The virus-dose × time formula is the basis of contact tracing. Anyone you spend more than 10 minutes with in a face-to-face situation is potentially infected. Anyone who shares a space with you (say an office) for an extended period is potentially infected. This is also why it is critical for people who are symptomatic to stay home. Your sneezes and your coughs expel so much virus that you can infect a whole room of people.

What is the role of asymptomatic people in spreading the virus?

Symptomatic people are not the only way the virus is shed. We know that at least 44% of all infections–and the majority of community-acquired transmissions–occur from people without any symptoms (asymptomatic or pre-symptomatic people). You can be shedding the virus into the environment for up to 5 days before symptoms begin.

Infectious people come in all ages, and they all shed different amounts of virus. The figure below shows that no matter your age (x-axis), you can have a little bit of virus or a lot of virus (y-axis). (ref)

The amount of virus released from an infected person changes over the course of infection and it is also different from person-to-person. Viral load generally builds up to the point where the person becomes symptomatic. So just prior to symptoms showing, you are releasing the most virus into the environment. Interestingly, the data shows that just 20% of infected people are responsible for 99% of viral load that could potentially be released into the environment (ref)

So now let’s get to the crux of it. Where are the personal dangers from reopening?

When you think of outbreak clusters, what are the big ones that come to mind? Most people would say cruise ships. But you would be wrong. Ship outbreaks, while concerning, don’t land in the top 50 outbreaks to date. Ignoring the terrible outbreaks in nursing homes, we find that the biggest outbreaks are in prisons, religious ceremonies, and workplaces. Any environment that is enclosed, with poor air circulation and high density of people, spells trouble.

Some of the biggest super-spreading events are:

  • Weddings, funerals, birthdays: 10% of early spreading events
  • Business networking events: e.g. the face-to-face Biogen conference in Boston in late February

Some actual scenarios that have unfolded

Restaurants: Some really great shoe-leather epidemiology demonstrated clearly the effect of a single asymptomatic carrier in a restaurant environment (see below). The infected person (A1) sat at a table and had dinner with 9 friends. Dinner took about 1 to 1.5 hours. During this meal, the asymptomatic carrier released low levels of virus into the air through their breathing. Airflow (from the restaurant’s various vents) was from right to left. Approximately 50% of the people at the infected person’s table became sick over the next 7 days. 75% of the people at the adjacent downwind table became infected, and even 2 of the 7 people at the upwind table were infected (believed to have happened via turbulent airflow). No one at tables E or F became infected; they were out of the main airflow running from the air conditioner on the right to the exhaust fan on the left of the room. (Ref)

Workplaces: Another great example is the outbreak in a call center (see below). A single infected employee came to work on the 11th floor of a building. That floor had 216 employees. Over the period of a week, 94 of those people became infected (43.5%: the blue chairs). 92 of those 94 people became sick (only 2 remained asymptomatic). Notice how one side of the office is primarily infected, while very few people were infected on the other side. While the exact split between infection by respiratory droplets and infection by fomite transmission (door handles, shared water coolers, elevator buttons, etc.) is unknown, the case serves to highlight that being in an enclosed space, sharing the same air for a prolonged period, increases your chances of exposure and infection. Another 3 people on other floors of the building were infected, but the authors were not able to trace their infections to the primary cluster on the 11th floor. Interestingly, even though there was considerable interaction between workers on different floors of the building in elevators and the lobby, the outbreak was mostly limited to a single floor (ref). This highlights the importance of exposure and time in the spreading of SARS-CoV2.

Choir: The community choir in Washington State. People were aware of the virus and took steps to minimize transfer: they avoided the usual handshakes and hugs hello, brought their own music to avoid sharing, and socially distanced themselves during practice. They even went to the lengths of telling choir members prior to practice that anyone experiencing symptoms should stay home. Nevertheless, a single asymptomatic carrier infected most of the people in attendance. The choir sang for 2.5 hours, inside an enclosed rehearsal hall roughly the size of a volleyball court.

Singing, to a greater degree than talking, aerosolizes respiratory droplets extraordinarily well, and the deep breathing while singing carried those respiratory droplets deep into the lungs. Two and a half hours of exposure ensured that people received enough virus over a long enough period of time for infection to take place. Over a period of 4 days, 45 of the 60 choir members developed symptoms and 2 died. The youngest infected was 31, but the average age was 67. (corrected link)

Indoor sports: A super-spreading event occurred at a curling tournament in Canada with 72 attendees. Curling brings contestants and teammates into close contact in a cool indoor environment, with heavy breathing for an extended period. The tournament resulted in 24 of the 72 people becoming infected. (ref)

Birthday parties / funerals: This is a real story from Chicago; only the name is fake. Bob was infected but didn’t know. Bob shared a takeout meal, served from common serving dishes, with 2 family members. The dinner lasted 3 hours. The next day, Bob attended a funeral, hugging family members and others in attendance to express condolences. Within 4 days, both family members who shared the meal were sick, as was a third family member who hugged Bob at the funeral. But Bob wasn’t done. Bob attended a birthday party with 9 other people. They hugged and shared food at the 3-hour party. Seven of those people became ill. Over the next few days Bob became sick; he was hospitalized, ventilated, and died.

But Bob’s legacy lived on. Three of the people Bob infected at the birthday went to church, where they sang, passed the tithing dish etc. Members of that church became sick. In all, Bob was directly responsible for infecting 16 people between the ages of 5 and 86. Three of those 16 died.

The spread of the virus within the household and back out into the community through funerals, birthdays, and church gatherings is believed to be responsible for the broader transmission of COVID-19 in Chicago. (ref)

Commonality of outbreaks

The reason to highlight these different outbreaks is to show you the commonality of outbreaks of COVID-19. All these infection events were indoors, with people closely-spaced, with lots of talking, singing, or yelling. The main sources for infection are home, workplace, public transport, social gatherings, and restaurants. This accounts for 90% of all transmission events. In contrast, outbreaks spread from shopping appear to be responsible for a small percentage of traced infections. (Ref)

Importantly, of the countries performing contact tracing properly, only a single outbreak has been reported from an outdoor environment (less than 0.3% of traced infections). (ref)

Overriding thoughts

Indoor spaces, with limited air exchange or recycled air and lots of people, are concerning from a transmission standpoint. We know that 60 people in a volleyball court-sized room (choir) results in massive infections. Same situation with the restaurant and the call center. Social distancing guidelines don’t hold in indoor spaces where you spend a lot of time, as people on the opposite side of the room were infected.

The principle is viral exposure over an extended period of time. In all these cases, people were exposed to the virus in the air for a prolonged period (hours). Even if they were 50 feet away (choir or call center), even a low dose of the virus in the air reaching them, over a sustained period, was enough to cause infection and in some cases, death.

Social distancing rules are really to protect you with brief exposures or outdoor exposures. In these situations there is not enough time to achieve the infectious viral load when you are standing 6 feet apart or where wind and the infinite outdoor space for viral dilution reduces viral load. The effects of sunlight, heat, and humidity on viral survival, all serve to minimize the risk to everyone when outside.

When assessing the risk of infection (via respiration) at the grocery store or mall, you need to consider the volume of the air space (very large), the number of people (restricted), how long people are spending in the store (workers – all day; customers – an hour). Taken together, for a person shopping: the low density, high air volume of the store, along with the restricted time you spend in the store, means that the opportunity to receive an infectious dose is low. But, for the store worker, the extended time they spend in the store provides a greater opportunity to receive the infectious dose and therefore the job becomes more risky.
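The judgment call just described – more people and more time mean more risk, a larger air volume means less – can be captured as a toy heuristic. To be clear, the formula and the numbers below are my own illustration for this post, not a validated epidemiological model:

```python
def relative_risk(people, minutes, air_volume_m3):
    """Toy heuristic: exposure risk grows with crowd size and time spent,
    and shrinks with the volume of air diluting the virus.
    Purely illustrative -- NOT a validated epidemiological model."""
    return people * minutes / air_volume_m3

# A quick shopper in a large store vs. a worker there all day,
# same crowd, same air volume (numbers invented for the example):
shopper = relative_risk(people=50, minutes=60, air_volume_m3=50_000)
worker = relative_risk(people=50, minutes=480, air_volume_m3=50_000)
print(worker / shopper)  # the worker's exposure is ~8x the shopper's
```

The point of the toy is only that time in the shared air dominates: the same store is a low-risk errand for the customer and a meaningfully riskier environment for the person who stands in it all day.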

Basically, as the work closures are loosened and we start to venture out more, possibly even resuming in-office activities, you need to look at your environment and make judgments: how many people are here, how much airflow is there around me, and how long will I be in this environment? If you are in an open-floorplan office, you really need to critically assess the risk (volume, people, and airflow). If you are in a job that requires face-to-face talking, or even worse, yelling, you need to assess the risk.

If you are sitting in a well ventilated space, with few people, the risk is low.

If you are outside and you walk past someone, remember it is “dose and time” that are needed for infection. You would have to be in their airstream for 5+ minutes for a chance of infection. While joggers may be releasing more virus due to deep breathing, remember that the exposure time is also less due to their speed. Please do maintain physical distance, but the risk of infection in these scenarios is low. Here is a great article in Vox that discusses the low risk of running and cycling in detail.

Don’t forget surfaces. Those infected respiratory droplets land somewhere. Wash your hands often and stop touching your face!

As we are allowed to move around our communities more freely and be in contact with more people in more places more regularly, the risks to ourselves and our family are significant. Even if you are gung-ho for reopening and resuming business as usual, do your part and wear a mask to reduce what you release into the environment. It will help everyone, including your own business.

COVID-19 Superspreader Events in 28 Countries: Critical Patterns and Lessons

Key Ideas

So one of the key ideas I’ve had and put to use teaching over 3,000 students, from general computing all the way to Artificial Intelligence skills, has been to give them bite-sized pieces of information about advanced and fairly arcane, difficult-to-understand technologies in a unique hands-on manner, using what I call the ‘inverted classroom.’ Read on to learn more about this.

When presented in the usual way of lectures, slides and books, these technologies are so hard to learn that the task is almost insurmountable. My approach, however, has worked incredibly well, and I’ve graduated more than 3,000 students to date in subjects that to many seemed impossible to learn before they went through one of my courses.

I invite you to join me and share in this big, bright, new world that has so much fascinating stuff going on in it. For a while now I’ve been [forced] to focus on patients with Type II Diabetes, putting them onto my special version of the ketogenic diet and curing over 99.9% of the 2,260 patients (only 2 failed treatments in the entire lot over the past 17 years that I’ve been doing this, and those only because the two patients ‘cheated’ and didn’t follow the protocol I laid out for them). But it has become only too clear to me over these past few months that this is not a sustainable model. I am only one person, and dealing with more than 2,000 currently active patients has been taxing to say the least. Also, it has begun to feel like I’m wasting too much of a good thing by being limited to so few cures, borne of the constraint of <one person:one patient> at a time. What if there were 10 of me? What if 1,000? Would the results not scale up to 2,000,000 cures? Would that not be a legacy worth leaving behind?

It took a while, but I’ve finally arrived at the thought that I must remodel these ideas of mine and try to get that 3-orders-of-magnitude (1000X) increase in my output by creating as many copies of me as possible. But how can I do that? I’ve spent more than 30 years researching this nutrition topic; I’ve spent ALL my adult life working/playing/living with computers, code and algorithms. From the 2-year stint I spent living with the Masai from 1984 to 1986 – collecting nutrition data, putting it into a master’s-level thesis, subjecting that study to rigorous peer review, having holes bored into it, repeating it with corrections – this knowledge emerged. Is it possible to teach the same to a new cadre of medical professionals so they can go out and achieve the same or similar results as mine, without the 30-year lead time?

And what about applying these same AI and Deep Learning technologies to other fields? I did, and continue to do so, in finance. And I’m doing it in the cryptocurrency world right now, even as I write this. I am financially free (and able to spend time writing this blog) because I know these technologies and have figured out how to put them to use for myself. I find myself on the right side of the digital divide. Can this be taught to the general public? I taught it to a bunch of friends – some doctors, some who’d never seen algorithms before. What about the general public?

YOU will be the judge of that. Get ready to join the 3000+ that have already capitalised on my offerings and seen their lives take off to places they’d never thought they’d reach.

A little Hx

Hx is doctors’ standard shorthand for ‘history’ – part and parcel of the cloak-and-dagger stuff the medical profession is so wont to push ‘out there’ onto all and sundry. I have to admit, though, that even I was prone to this ‘holier than thou’ attitude, especially way back when computers were such enigmatic entities understood by only a few of us ‘wizards.’ But all that has changed now, with patients no longer accepting what their doctors tell them until they themselves have googled the terms, protocols, diagnoses and prognoses and satisfied themselves as to the veracity of what they’re being told. And that attitude now pervades ALL of life, with so many people wanting to figure out for themselves what is going on, whether it really is happening to them, and what the cause of it is.

So, if we go back a ways we come to 1977 and a 17-year-old me reading an article in Popular Science magazine. The article announced the arrival of a cheap computer that one could buy in kit form, assemble at home and plug into the living room TV, getting a prompt and being able to start programming in a language called Sinclair BASIC. I was beside myself. I could hardly sleep, and yet I knew I’d not be able to buy that machine as I had nowhere near the £50 being asked for it. I read the article over and over again until the magazine, tattered and torn, finally gave up the ghost and had to be consigned to the rubbish bin.

So, there was no other way out … I had to put my nose to the grindstone, learn by reading and experimentation, and build a computer from the ground up. Fortunately I knew a lot about electronics (it had been a hobby of mine before the advent of computers), and though what I didn’t know caused my fledgling computer to crash and burn, I did not give up; I just headed back to the drawing board and started over. Tenacity ruled the days and weeks, and finally I had a computer that could add a 1 and a 0 together!

Curiously, many years later Ben Eater created this playlist on YouTube, and it very closely mimics the process I went through. Ben, though, is an expert, while I crashed and burned a lot, emerging out the other end of the tunnel after much heartache and headache – but much learning too. And in the end I had that machine that I’d always wanted … a computer!

Then a few years later I went on to found two IT colleges, one of them the first internet-based training institute in Kenya and the first to run what we now call a MOOC. I taught hundreds of students what I know in the very special way you are about to discover.


The World a-Changin’

When I built my first computer in 1977, and then went on to medical school, found a mainframe in the basement of my college and put it together piece by piece, I had to program the things myself using Assembly Language, and it was pain, pain, PAIN! Then the computer world progressed to the C programming language and things got a lot better. And yet still – this was not for the faint-hearted and was reserved for a small minority of us. Setting up the tools was a pain, and several blind alleys awaited the unwary. You’d often find yourself needing to compile a program, and that was daunting enough. If you needed to compile a compiler, yet needed a compiler before you could compile at all, you found yourself in a chicken-and-egg trap with nowhere to go.

Fortunately that has all changed and today we have tools that are INCREDIBLE. You have no idea what you’ve been missing. There are tools ‘out there’ that are FREE, easy to install, easy to get up and running, don’t care whether you’re on Windows, Linux or macOS, and that almost disappear from your view so that you can concentrate on the ideas in your mind.

In my training programs I use the programming language Python. It’s not a new language, but it is THE MOST POPULAR language, with an enormous number of tools/libraries available for it.

source: https://pypi.org/

That’s close to 170,000 libraries of tools you can use right off the bat, letting you be up and running with a software tool you’ve written in a matter of minutes. Within a day or so you could have a complete app running on your phone, and within a month a trading system you probably don’t even know to dream of as you read this now.

Seriously, folks – install the tools I’ll guide you to install (a very easy half-hour job that consists mostly of downloading files from the internet, clicking buttons and following very clear on-screen instructions) – and I’ll have you writing your own Python code and executing it right here in your browser in 2 minutes flat!

In fact this is so easy I’m going to walk you through it right now and within 30 minutes (if you have a decent internet connection) you’ll be up and running with Python and a Jupyter Notebook.

  • Click here to go to the Anaconda website.
  • Scroll down till you see a green button that says download as in the image below:

Make sure that the platform you are installing for is the one you’re using – Windows, Mac or Linux and that you’re downloading the 3.x version not 2.x.

  • Click the green button and wait for the file to download to your computer.
  • Double-click the file you downloaded and follow the onscreen prompts to install the Anaconda python distribution on your computer.
  • Open a terminal or command line and type python -V (don’t know how to do that? Put a question in the comments on this page or reach out to me). You should get back a message that tells you what version of Python you now have installed. Make sure it’s Python 3.something not 2.x. I am no longer supporting Python version 2.
This is what my computer tells me I have installed. Yours might be a newer version, which is fine.

Now at the same terminal/command line type: jupyter notebook as I’ve done below:

Your browser will open a new tab with a listing of files in it. At the top right click <New> and choose Python 3 as in the image below and …

Voila! You have the Python 3 programming environment at your fingertips ready to do your every bidding. The cell you see with the green box around it is waiting for you to write your first command. Here … I’m going to do it …

I’ve entered print(“Hello World!”) and then pressed <control+enter> and this is what I got.

Did you get this? If so HOORAY – you’ve run your first Python program.

Time from start to finish should not have been more than 30 minutes, if you have a decent internet connection and did not make any mistakes or typos or miss any steps in the process I’ve described above. Are you a programmer yet? Hell no! That will take a while. Just as I can teach you all the chess moves in a morning yet you will not be a chess grandmaster by afternoon, becoming a programmer takes the same sort of practice, thought and study that becoming a grandmaster does. But you now have a programming environment that is so, so easy to use, easy to tinker with, and easy to put small bits of code into to see what they do when you execute them.
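To get you tinkering straight away, here are a few small bits of code to try. Type each group into its own notebook cell and press control+enter:

```python
# A few beginner-friendly snippets to try in separate notebook cells.

# Arithmetic:
print(7 * 6)            # 42

# Strings can be repeated and joined with + and *:
print("Hip " * 2 + "Hooray!")   # Hip Hip Hooray!

# A loop:
for i in range(1, 4):
    print("Step", i)

# A tiny function of your own:
def greet(name):
    return "Hello " + name + "!"

print(greet("World"))   # Hello World!
```

Change the numbers, the strings and the function, re-run the cell, and watch what happens. That experiment-and-observe loop is the heart of how we’ll learn.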

So what I’m here for now is to help you get the skills into your head and fingers that will make you a programmer interacting with a computer in ways you never thought possible, making the computer into an intelligent partner in your life.

Generally the process is very similar to the installation of Python I’ve walked you through above. It goes smoothly, and the tools are so transparent now that we can focus on the ideas and not on the tools any more. We can think about how to tweak algorithms so they make money for us, or work on new and novel ways of looking at our business or company data to extract new insights and meanings from it. Or we can download data sets and study them to figure out what we should be eating, based on hundreds or even thousands of pieces of data that we can tap into to figure out what the real deal is. Or we can find a data set to analyse and enter a contest in which we can win several thousand dollars, or even a million. The choice is ours. We no longer have to walk alone in the dark believing what all and sundry tell us. We can figure it out for ourselves, wring out the proofs, and go our merry way knowing that we know!

You’ve begun your journey, and it’s going to be a very exciting one.

Welcome aboard!

Ya only got yer feet wet

Now that you’ve got your feet wet, with Python installed and tested, follow me on this journey to prove to yourself how easy, fast and fun it is to analyse data with Python.

We’re going to be working with the Pima Indians diabetes data. This is an open dataset that you can get from the UCI Machine Learning Repository, and it will be loaded directly by each little program we’re going to run below to amaze ourselves with what Python can do. But you don’t have to search for the file on the UCI server – I’ve provided it for you here. Right-click THIS LINK and save the file to a directory on your computer. I suggest you create a directory named DNA-Python and save it there. We’ll use this directory in future classes, so it’ll be useful to create it right now.

Once you’ve downloaded it, open it with a spreadsheet application. In most cases all you have to do is right-click the file, choose <open with> and choose your spreadsheet app – Microsoft Excel, OpenOffice, Numbers – whatever you have on your machine. If you don’t have a spreadsheet app, open it with a text editor – Notepad, GEdit, TextEdit, Sublime, Atom, Emacs, Vim – whatever you like.

Here is the file open in various apps on my machine:

Here it is open in TextEdit.
And here is OpenOffice asking me what parameters I want to open it with …

And here it is open in OpenOffice.

Now that you have the file open, scroll through and have a good look at it. There are 9 columns, A to I, across the top and 768 rows: 9 measures of health for each of 768 patients.

What else can you say about this data? Can you tell what the columns are depicting? And how do we find out more about this dataset, to decide whether it interests us and whether we can use it to learn something about data science?

Let’s try a Google search. I put <pima indians diabetes dataset> into Google and found that it is on Kaggle and that I can source it through my free account on data.world, so I headed there and this is what I got >>

So, a little bit more information about the dataset. Getting somewhere. If I scroll around and click on various links just browsing, playing with the tools, gleaning a little bit here and there about this dataset I find this >>

Aha! Here are the 9 columns. So now I have some idea what I can call my columns as I analyse the data with python.

Now here is an important thing for you to do to follow along with me and learn some data science. Fire up a Jupyter notebook. Forgot how to? Go back and read this article once more, then come back here and start a Jupyter notebook. Make sure you’re in cell 1 of your Python 3 notebook and TYPE in all the code you see below. Don’t try to understand any of it right now, just type it in. My goal here is to show you what Python can do, not to teach you Python the language. We’ll get to that later. So in cell 1 type the code you see below and press control-enter when you’re done. There is one stumbling block here, and that is figuring out the path to the PIMA file you downloaded. I can’t know what your path should be, and this is one of the steps you have to ferret out yourself. So pull out that detective’s hat, Sherlock, and get it to run. Here’s what it looks like on my machine >>
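The screenshots don’t survive in this text version of the page, so here is a sketch of the kind of cell being described. The column names are my own reading of the dataset documentation, and three rows of the PIMA data are inlined only so the sketch runs anywhere; on your machine you’d point read_csv at the file you downloaded (your path will differ, as noted above).

```python
import io
import matplotlib
matplotlib.use("Agg")   # headless-safe; in a Jupyter notebook the plots just appear
import pandas as pd

# The 9 column names as I read them off the dataset page
cols = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
        "Insulin", "BMI", "DiabetesPedigree", "Age", "Outcome"]

# On your machine you'd read the file you saved, e.g.
#   df = pd.read_csv("DNA-Python/pima-indians-diabetes.csv", names=cols)
# Three rows are inlined here only so this sketch runs anywhere:
sample = io.StringIO("6,148,72,35,0,33.6,0.627,50,1\n"
                     "1,85,66,29,0,26.6,0.351,31,0\n"
                     "8,183,64,0,0,23.3,0.672,32,1\n")
df = pd.read_csv(sample, names=cols)

print(df.describe())   # quick summary statistics for every column
df.hist()              # one histogram per column, like the plots in the screenshots
```

The point of the cell is simply: load the data, name the columns, look at it.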

So some of you are complaining that you got an error message stating something like matplotlib not found or pandas not found. If you do, this is what you have to do: open up a terminal (or command prompt in Windows) and type this >>

or this …
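The commands in those screenshots don’t come through in this text version, but they are in all likelihood along these lines (the first works on most machines; the second is for systems that keep their Python 2 and Python 3 tools separate):

```shell
pip install pandas matplotlib
# or this:
pip3 install pandas matplotlib
```

If both pandas and matplotlib are already installed, pip will simply tell you so.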

When I do this on my machine I get this >>

Which is a good thing, telling me I already have these libraries installed. Remember the libraries we talked about on this page? These are two of them, and they’re giving us the ability to use those functions you typed in earlier.

OK. Go back to the cell you keyed in. The one that looks like this >>

And press control-enter again. Did you get this …

If you did, CONGRATULATIONS! You’ve done your first analysis of a dataset using Python. Now we can look at the plots and try to figure out what our data is telling us. I don’t want to get into the data science part of this course just yet. At this point in our explorations I just want you to see what an amazing tool Python is.

Make sure you get this working. Make sure you KEY IN ALL the code I’ve listed above. C’mon, it’s only 7 lines of code. KEY IT IN and press control-enter. Get it to work. This is the grunt work of a data scientist working in Python. Not too hard, is it? A little more knowledge under your belt and your bosses or partners will be calling you a genius.

Ready for the journey of a lifetime? Read on or reach out to me and let’s talk about what heights you want to take your life to.

Now, in the same intuitive way, I’m going to lead you through a neural network. Yes! You read that right. We’re going into the big leagues and building a neural network right there in your Jupyter Notebook! Click here to visit that page. It’s a work in progress, so you should come back often to keep up with the updates I’ll be making over the next few days.

The magic of neural networks

Neural networks are my favourite AI application – and that’s great for you, because they’re the most important development in computing since the internet! And I’m about to give you the keys to the neural net kingdom. In just a few short steps you’re going to build a neural network. I’m going to explain neural networks to you the same way I do all my training – by hands-on doing!

Key this line into your notebook and execute it by pressing <shift+enter> so you get the cursor to drop down to the next cell ready for you to execute the next line.
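The line in question is the import you can see at the top of the complete listing at the end of this page:

```python
import numpy as np

print(np.__version__)   # if this prints a version number, numpy is working
```

The `as np` part just gives the library a short nickname so you can type np.something instead of numpy.something from here on.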

Did it work or did you get an error? If you got an error like — numpy not found – you need to install numpy. Go back to this page and read how to do that. Don’t get frustrated this early in the game. Be patient; go back and read that page and get it to work. Take breaks if you feel you need to. You need to let your brain internalise all this new information that is coming its way.

Numpy is a very useful library. Click on this link to open a tab in your browser with info on numpy in it. Just have a quick look around; don’t bother your head about it too much. You’re going to learn all about numpy in the days to come, but not in the usual manner of lectures, with a boring lecturer standing in front of you trying to drum in lines of code. With me teaching you, you’re going to learn the HOW of using tools such as numpy. You’re going to start using them right away and give yourself a framework on which to hang the knowledge of these tools as and when you need it – I call it just-in-time learning! Then, as you use a tool, you’ll learn the parts you need, and you’ll learn how to get information on the things just at the edge of what you’re using, so that you don’t miss out on any functionality that numpy (or any other library) can give you to improve your code and your day.

OK. Numpy is working. Now key this in.
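This is the cell in question – the sigmoid function you’ll find again in the complete listing at the end of this page, with its deriv branch written out explicitly:

```python
import numpy as np

# sigmoid function: squashes any number into the range (0, 1);
# with deriv=True it returns the slope of the sigmoid instead
def nonlinear(x, deriv=False):
    if deriv:
        return x * (1 - x)   # derivative, given that x is already a sigmoid output
    return 1 / (1 + np.exp(-x))

print(nonlinear(0))          # 0.5 — the sigmoid's midpoint
```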

… and again press <shift+enter>; the cursor should drop into the empty cell that opens below this one, with no error reported. (Once again: if you do get an error, please add a comment to this post so we can all jump to your rescue and everyone can learn from your feedback.)

What are we doing here? The line which starts with a # symbol is a comment. You’re telling the python interpreter that this line is not code to be executed. This is a comment to yourself; something you’ve put here to remind yourself or explain to yourself what you were thinking as you wrote this. If you come back to this code days or weeks later this comment will jog your memory. As a comment it has no effect on the interpreter and does nothing as code might be expected to do.

Next is the line that begins with def. def is a Python keyword, a word reserved for Python’s use, that tells the interpreter you are about to define a function. Here you’ve given the function the name nonlinear, and it takes two parameters, x and deriv. deriv has a default value of False, used whenever you call the function without supplying one, but x has no default, so you must always pass it in. Now read the rest of the function and ask me questions in class or in the comments section below. I can guess that x times 1 minus x won’t give you too much to worry about, but what is np.exp?

Remember when we asked Python to import numpy and let us use it as np? Well, that’s exactly what we’re using here. We’re saying to the interpreter: give us the function in numpy that is called exp. We do that by writing np.exp (np dot exp), and we give the exp function the value -x (the negative of x) so it can do its magic.

Go to this link on the numpy website now and browse through the list. Take your time. In the search box at the right margin (as in the image I’ve captured below) …

… type in exp and click the <search> button and click on the numpy.exp link – the top or second one will do, it does not matter which one you click on. This is what you’ll get …

Look at the line that says: Calculate the exponential of all elements in the input array. I know: there’s a lot to parse here. What is an exponential? What is an array? Take it slowly. At first don’t worry at all; just key in the code and make sure it works.

This is our “nonlinearity”. There are several kinds of functions we could have used here; the one we’re using is called a “sigmoid”. A sigmoid function maps any value to a value between 0 and 1. We use it here to convert numbers to probabilities. It also has several other desirable properties for training neural networks.
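You can see the squashing for yourself: feed the function numbers of any size and the outputs all land between 0 and 1 (this is the same nonlinear() you keyed in above):

```python
import numpy as np

def nonlinear(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# large negatives come out near 0, large positives near 1, zero is exactly 0.5
for v in [-100, -1, 0, 1, 100]:
    print(v, nonlinear(v))
```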

Now key in …

# input dataset
X = np.array([  [0,0,1],
                [0,1,1],
                [1,0,1],
                [1,1,1] ])

… and again press <shift+enter>. (from now on I won’t ask you to execute the code. I’ll just assume that you’ll enter and execute it. Remember that you should KEY IT IN and not just cut and paste. Keying in the code has real value in getting your brain to process what you’re doing at a much deeper level than just watching stuff happen like you’re watching a movie).

We’ll talk more about what that code is doing but it’s quite easy … we’re creating an array which will become the data we’re going to input into our neural network to train the network to make predictions.

# output dataset            
y = np.array([[0,0,1,1]]).T

This is the result we expect the neural net to come up with.
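That trailing .T is numpy’s transpose: it flips the 1×4 row you typed into a 4×1 column, so there is one expected output per training example. A quick check:

```python
import numpy as np

y = np.array([[0, 0, 1, 1]]).T
print(y.shape)   # (4, 1): four rows (one per example), one column
```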

# seed random numbers to make calculation
# deterministic (just a good practice)
np.random.seed(1)

This is some good practice. We’d get away with not executing this code, but it’s just good practice to seed our random number generators so every run produces the same results. Again: we’ll talk much more in class about random number generators. You can google the term if you’re too eager to wait for class.

# initialize weights randomly with mean 0
synapse0 = 2*np.random.random((3,1)) - 1

This is our weight matrix for this neural network. I called it “synapse0” as it feels to me like a synapse – a meeting point – between two biological nerve cells (neurons). Since we only have 2 layers (input and output), we only need one matrix of weights to connect them. Its dimension is (3,1) because we have 3 inputs and 1 output. Another way of looking at it is that layer0 is of size 3 and layer1 is of size 1. Thus, we want to connect every node in layer0 to every node in layer1, which requires a matrix of dimensionality (3,1). 

Also notice that it is initialized randomly with a mean of zero. There is quite a bit of theory that goes into weight initialization. For now, just take it as a best practice that it’s a good idea to have a mean of zero in weight initialization. 
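np.random.random((3,1)) draws uniformly from [0, 1); multiplying by 2 and subtracting 1 stretches that to [-1, 1), which is centred on zero. You can verify the shape and range yourself:

```python
import numpy as np

np.random.seed(1)                     # repeatable draws
synapse0 = 2 * np.random.random((3, 1)) - 1

print(synapse0.shape)                 # (3, 1): 3 inputs feeding 1 output
print(synapse0.min() >= -1)           # True: nothing below -1
print(synapse0.max() < 1)             # True: nothing at or above 1
```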

Another note is that the “neural network” is really just this matrix. We have “layers” layer0 and layer1 but they are transient values based on the dataset. We don’t save them. All of the learning is stored in the synapse0 matrix.

for iter in range(10000):

    # forward propagation
    layer0 = X
    layer1 = nonlinear(np.dot(layer0,synapse0))

    # how much did we miss?
    layer1_error = y - layer1

    # multiply how much we missed by the 
    # slope of the sigmoid at the values in layer1
    layer1_delta = layer1_error * nonlinear(layer1,True)

    # update weights
    synapse0 += np.dot(layer0.T,layer1_delta)

This is the meat of our neural network. Read through the comments first, don’t worry about the code. After you’ve read the comments and got a feel for where we’re going with this go back and read through the code line by line.

Can you now look up on the internet what np.dot is about? Can you see how we’re calling the function nonlinear() that we defined earlier? Can you see how we find the error and store it in layer1_error and push that back in our nonlinear() function storing the result in another variable layer1_delta?
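When you look np.dot up, you’ll find that for 2-D arrays it is matrix multiplication: each row of the first array is multiplied element-by-element with each column of the second and summed. A tiny example with made-up numbers:

```python
import numpy as np

row = np.array([[1, 2, 3]])           # shape (1, 3) — like one row of layer0
col = np.array([[10], [20], [30]])    # shape (3, 1) — like synapse0
print(np.dot(row, col))               # [[140]] = 1*10 + 2*20 + 3*30
```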

Hey, you ‘sneaked in’ the word variable. Yes I did, and those are the kinds of tricks I have up my sleeve. Instead of telling you that Python has something called variables, which are places in memory where you can store a value for use later (yada, yada, yada), I sneak things in as we go and explain them on the fly, so you REALLY get to see those things in action and thoroughly internalise your learning.

print("Output After Training:")
print(layer1)

So finally let’s get some output to see how well our neural net did.

… and this is what I get when I execute that last cell. The result: values very close to 0, 0, 1, 1 – EXACTLY what we wanted to see!

Here’s all the code in one place:

import numpy as np

# sigmoid function
def nonlinear(x,deriv=False):
    if deriv:
        return x*(1-x)
    return 1/(1+np.exp(-x))
# input dataset
X = np.array([  [0,0,1],
                [0,1,1],
                [1,0,1],
                [1,1,1] ])
# output dataset            
y = np.array([[0,0,1,1]]).T

# seed random numbers to make calculation
# deterministic (just a good practice)
np.random.seed(1)

# initialise weights randomly with mean 0
synapse0 = 2*np.random.random((3,1)) - 1

for iter in range(10000):

    # forward propagation
    layer0 = X
    layer1 = nonlinear(np.dot(layer0,synapse0))

    # how much did we miss?
    layer1_error = y - layer1

    # multiply how much we missed by the 
    # slope of the sigmoid at the values in layer1
    layer1_delta = layer1_error * nonlinear(layer1,True)

    # update weights
    synapse0 += np.dot(layer0.T,layer1_delta)

print("Output After Training:")
print(layer1)

Onward me hearties

So you’ve got this far: you’ve installed a programming environment (Python and Jupyter Notebooks) and done a little data science and a little neural network programming.

You could of course go at it on your own – get a few books, wade through them and teach yourself this stuff. It’s possible and it’s been done before. Elsewhere on this website I’ve recommended Wes McKinney’s book for data science and Aurélien Géron’s for machine learning. And then of course there are Sebastian Raschka’s wonderful, wonderful books to dig deeper still.

Or you could watch some YouTube videos, or take some Coursera, Udemy, Codecademy or edX courses, try the code out, and progress one video or course at a time.

Yes, it can be done, but it’s going to be tough going, a long slog, and you’re going to get stuck from time to time and will need help getting unstuck. That’s where I come in. With me ‘at the front of your class’, so to speak, we’d employ my inverted classroom technique: you’d choose a data set (even one in your own company or business) or a chapter or paragraph in a book, try keying in and executing some code, and then ‘come to class’ to ask questions, play with the code in the presence of your peers, learn from each other, test out new ideas and really, REALLY polish up your skills, internalise and consolidate the knowledge gleaned and become a true expert in the field.

Or you could choose to try for a million dollars in a Kaggle competition or to analyse some public data sets…

Will it take the 10,000 hours that Gladwell says it will? HELL NO! I can literally have you up and programming in as little as 30 minutes even if you’ve never programmed before. Do you stand a chance of winning that Kaggle prize? Yes of course you do. Will you get stuck along the way and will I be there to help you? Yes I will.

If you’ve gone through the pages in the list under my <recent posts> category title at the left of this page (or scrolled down sequentially if you’re on a mobile phone), read it all and done the exercises as you went, you’re well on your way to becoming a data scientist and AI practitioner. Was it hard to do? Did it take thousands of hours to learn and perfect? Of course not. All it took was the willingness to dip your toes in the water, install the environment (which boiled down to clicking a few green buttons on screen), key in a few words – even if you couldn’t make head or tail of what they meant – and then hit <control+enter>, and your code was running.

What more could you ask for? How much easier than this can it become?

And if it’s that easy … what is your excuse for not developing algorithms that trade money for you and make money automagically, without emotion, 24/7/365? Or running algorithms that show you, beyond a shadow of a doubt, how to live your life for the greatest health-span possible? Or figuring out what is best for your business, patients or customers in order to grow your business in leaps and bounds? Or all of these at the same time, as I do?

Stop whining about not being able to get on in the world and make a HUGE success of your life. There has never been a better time than now to take advantage of the massive leaps in technology that are presenting all of us with so many opportunities it’s becoming an embarrassment of riches. Wake up! Get off that couch. Create a windfall … do it TODAY!