Posts Tagged ‘data’

What kind of food does Leeds eat?

Wednesday, December 23rd, 2015 | Food

Following on from my previous post looking at statistics we can pull out from the Leeds Restaurant Guide dataset, I wanted to look at how the restaurant scene has changed since we first published the guide.

Here it is:

[Chart: number of restaurants per cuisine, by edition]

In this graph, I have plotted each cuisine type against the number of restaurants. This is shown for the 1st edition (2013), 3rd edition (2014) and 5th edition (2015). As we learned in the last post, the number of restaurants has risen, so in general, we would expect most categories to have grown between each edition. I have not included pub grub, as the size of that category makes the rest of the data difficult to see.

For the most part, this holds true. Some cuisines have grown faster than others though. We have seen a rise in restaurants serving American, British, International (those that serve food from all over the world with no real speciality) and steak.

Other areas have declined though: buffet, French, Indian and seafood are all down. Persian is too, but this was always a small market. The biggest change is possibly Chinese restaurants. In the first edition we had seven Chinese restaurants; now we have only four.

In terms of the most popular cuisines, Italian remains king. When we first wrote the guide we even considered splitting Italian into two categories, one for general Italian and one for restaurants that specifically did pizza. Latin is also very popular thanks to the growth of tapas bars. It used to be as popular as Indian, but Indian has since fallen away.

We can draw the most popular cuisines in a table. I have omitted hotels and casinos, and international because these do not really tell us anything about people’s tastes.

Position	2013	2015
1	Italian	Italian
2	Latin	Latin
3	Indian	British
4	British	American
5	American	Indian

It is a pretty consistent story. The only change is that Indian has dropped from a joint-second spot in 2013 to 5th, behind British and American. Much of the growth in these categories is down to meat-focused places such as burger and BBQ restaurants, so it could be that people have been looking towards more meat-heavy dishes in recent years. Or it could just be random chance. The sample size is not that big, after all.

Dataclysm

Saturday, December 19th, 2015 | Books

Christian Rudder is a founder and head of data trends at the dating site OkCupid. For years he ran the blog OkTrends, which looked at what data you could mine from the site. This book continues that work and brings in other data sets, mostly to talk about human sexuality.

The full title is Dataclysm: Who We Are (When We Think No One’s Looking).

The anonymised data of OkCupid in aggregate provide some surprising facts, and some expected ones. Take gender differences, for example. Women rate men of a similar age to themselves as the most attractive. Up until 30, women rate men a year or two older than them as the most attractive; after 30, they find men a year or two younger most attractive. A drop-off starts at 40. That is a good innings though. Compare this to the way men rate women. They rate 21-year-olds the most attractive and it goes downhill from there.

He looks at the use of English on Twitter. Many people suppose the internet is degrading the quality of language used. Not so. The average length of a word used on Twitter is actually longer than in professional publications, and longer than it has been historically. It turns out that when you limit people to 140 characters, they write concisely using a wide lexicon.

He quotes Steve Jobs: “people don’t know what they want until you show it to them”. This always reminds me of the Henry Ford quote “if I had asked my customers what they wanted, they would have said a faster horse”. Whether Ford actually said that is unknown, but it makes a good point. When asking for feedback you really need to find out what people think the problem is, rather than asking them what they think the solution is. In Ford’s case the problem was finding a faster way to get from A to B, and in Jobs’s case an easier way to play and listen to music.

Back on OkCupid, it turns out that everyone is a racist. Rudder breaks the data down by how four groups (white, Asian, Latino and black) rate each other’s photos. It turns out that people generally rate their own race as the most attractive, but the real drop-off is for black women, whom every other group consistently rates lower. This varies geographically, however. There is a big gap in the US, for example, while there is almost no gap in the UK.

He also looks at the differences between the heterosexual and LGBT communities. Is sexuality a spectrum, for example? Only 19% of self-identifying bisexuals regularly message both males and females. This could imply a number of things. It could be that there is a spectrum and many bisexuals fall at either end of it. It could also be that some gay people identify as bisexual for cultural or social reasons, especially given that it correlates with their state’s tolerance of homosexuality. The answer is probably a number of different factors.

Rudder also mentions that Justine Sacco, the woman who made the “hope I don’t get aids” tweet, worked for OkCupid’s parent company. Sacco was discussed in Jon Ronson’s book So You’ve Been Publicly Shamed. The hashtag #HasJustineLandedYet is a classic example of how quickly things can travel around the world these days.

In summary, it’s not too clear what Dataclysm was actually about. It seemed to be mostly “here is some interesting data about people”. In that respect, it was genuinely interesting. It also had a lot of crossover with A Billion Wicked Thoughts in using anonymous internet data, a source that has only come around in the last few decades, to reveal fascinating insights into human thoughts and behaviour.


Technical difficulties

Monday, October 13th, 2008 | Tech

I came to access the data partition of my hard drive yesterday and received a message informing me that my E: drive was not formatted, and would I like to format it now?

Awwww, crap.

Having tried all the usual solutions, I installed Partition Table Magic (having shelled out quite a bit for it) to fix everything. Which it didn’t. It did, however, mess with my partition table so that instead of booting off my main partition, it now booted off the small media partition that you can use to play movies and such without booting up the full operating system.

So now I couldn’t get Windows to boot at all.

I downloaded GParted and booted into that with the LiveCD and managed to reset the flag so that it booted off the proper partition. But now the media center partition is visible for the first time and who knows what other damage has been done.

Still, I’ve got past the heart-wrenching hours where I thought I was going to have to shell out for professional data recovery, so things could be worse.

SQL UPDATE command

Sunday, September 16th, 2007 | Life, Tech

The SQL UPDATE command allows you to make specific changes to one or more rows. You could, in theory, do this by deleting the old row and inserting a new one, but then you have to supply all of the row’s data again, not just the part that changed, and you can only handle one row at a time. Using UPDATE solves both of these problems.

Basic syntax

As with every SQL command, UPDATE has a basic structure made up of several parts. The syntax is as follows:

UPDATE table SET field1 = 'value1', field2 = somenumber WHERE field3 = 'value3'

There are three parts to this. The first is the table to update. The second is the values you are setting. The third is the WHERE condition, which decides which rows in the table are updated. If you leave the WHERE clause out, every row in the table is updated.

Sample table

To make this easier to follow, I am going to create a sample table of data to work from. The table is going to be called friends and will have several bits of data in it.

Name	email	age	favourite colour
Jim	jim@mail.com	24	blue
Alex	alex@operamail.com	23	black
Josh	happy@killer.com	30	grey
Mike	mikebob@avril.com	27	blue

OK, first let’s say that Josh tells me that his email has changed from happy@killer.com to josh@wiggam.com. So we need to run an update command to change the email address from the old one to the new one.

UPDATE friends SET email = 'josh@wiggam.com' WHERE name = 'Josh'

In this code, every row where the name is Josh will have its email changed to josh@wiggam.com. This would be a problem if I had two Joshes, but I don’t, so only one row is updated. If I did have two Joshes I could change the condition to say:

WHERE email = 'happy@killer.com'

Or I could also try

WHERE name = 'Josh' AND age = 30
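If you want to try this without setting up a database server, here is a minimal sketch using Python’s built-in sqlite3 module and an in-memory database. The Python wrapper, the choice of SQLite and the underscore in favourite_colour are my assumptions for illustration; the UPDATE itself is the one from above with the stricter WHERE condition.

```python
import sqlite3

# In-memory SQLite database standing in for whatever database you use (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (name TEXT, email TEXT, age INTEGER, favourite_colour TEXT)"
)
conn.executemany(
    "INSERT INTO friends VALUES (?, ?, ?, ?)",
    [
        ("Jim", "jim@mail.com", 24, "blue"),
        ("Alex", "alex@operamail.com", 23, "black"),
        ("Josh", "happy@killer.com", 30, "grey"),
        ("Mike", "mikebob@avril.com", 27, "blue"),
    ],
)

# Run the UPDATE; rowcount reports how many rows the WHERE condition matched.
cursor = conn.execute(
    "UPDATE friends SET email = 'josh@wiggam.com' WHERE name = 'Josh' AND age = 30"
)
rows_changed = cursor.rowcount

# Read the row back to confirm the change.
josh_email = conn.execute(
    "SELECT email FROM friends WHERE name = 'Josh'"
).fetchone()[0]
jim_email = conn.execute(
    "SELECT email FROM friends WHERE name = 'Jim'"
).fetchone()[0]
print(rows_changed, josh_email, jim_email)
```

Checking the row count after an UPDATE is a cheap way to spot a WHERE condition that matched more (or fewer) rows than you expected.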

Updating multiple values

Next let’s pretend that it’s Alex’s birthday – he turns 24 and decides that his favourite colour is now green. So we need to update two values in the table at once. Luckily this is pretty simple to do.

UPDATE friends SET age = 24, favourite_colour = 'green' WHERE name = 'Alex'

First thing I must stress: you should never have a column name with a space in it. Don’t do it. The table above labels the column “favourite colour” to keep it readable, but written that way in a statement most databases will reject it as a syntax error unless the name is quoted, and the quoting rules differ between databases. Use an underscore instead, as in favourite_colour above.

In the above update statement both age and favourite_colour are updated in every row where the name is Alex. Also, there are no quotes around 24, as age is a number and so does not require them. Text values such as 'green' must always be quoted, or you will get an error; some databases will also complain if you quote a value for a number column.
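To see why a space in a column name is a nuisance, here is a small sketch, again assuming Python’s built-in sqlite3 module rather than any particular database from this article. In SQLite the name has to be wrapped in double quotes every time it appears; other databases use different quoting, so treat this as an illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The column name contains a space, so it must be double-quoted wherever it is used.
conn.execute('CREATE TABLE friends (name TEXT, age INTEGER, "favourite colour" TEXT)')
conn.execute("INSERT INTO friends VALUES ('Alex', 23, 'black')")

# Unquoted, the space splits the name into two words and the statement is rejected.
try:
    conn.execute(
        "UPDATE friends SET age = 24, favourite colour = 'green' WHERE name = 'Alex'"
    )
    unquoted_failed = False
except sqlite3.OperationalError:
    unquoted_failed = True

# Quoted, the same statement updates both columns at once.
conn.execute(
    """UPDATE friends SET age = 24, "favourite colour" = 'green' WHERE name = 'Alex'"""
)
alex = conn.execute(
    """SELECT age, "favourite colour" FROM friends WHERE name = 'Alex'"""
).fetchone()
print(unquoted_failed, alex)
```

With an underscore in the name none of the quoting is needed, which is the whole argument for avoiding spaces.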

Updating multiple rows

Finally, I am going to look at how to update several records at once, though we have really already covered it, so this is more of a confirmation to make sure you have it right. Take a look at this code:

UPDATE friends SET age = 60 WHERE favourite_colour = 'blue'

I have decided that everyone who likes blue is an old fogey, so I am going to set the age to 60 for everyone who said their favourite colour was blue. As usual, the statement looks for every row that satisfies the WHERE condition. This time it finds both Jim and Mike and changes both their ages to 60.
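Before running an UPDATE that can touch several rows, it is worth checking which rows the condition matches. The sketch below does that with a SELECT using the same WHERE clause, then runs the update. It assumes Python’s built-in sqlite3 module and a column named favourite_colour (with an underscore), neither of which comes from the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (name TEXT, email TEXT, age INTEGER, favourite_colour TEXT)"
)
conn.executemany(
    "INSERT INTO friends VALUES (?, ?, ?, ?)",
    [
        ("Jim", "jim@mail.com", 24, "blue"),
        ("Alex", "alex@operamail.com", 23, "black"),
        ("Josh", "happy@killer.com", 30, "grey"),
        ("Mike", "mikebob@avril.com", 27, "blue"),
    ],
)

# Dry run: list the rows the WHERE condition will hit.
matching = sorted(
    name
    for (name,) in conn.execute(
        "SELECT name FROM friends WHERE favourite_colour = 'blue'"
    )
)

# The real thing: every matching row gets the new age.
cursor = conn.execute("UPDATE friends SET age = 60 WHERE favourite_colour = 'blue'")
rows_changed = cursor.rowcount

ages = dict(conn.execute("SELECT name, age FROM friends").fetchall())
print(matching, rows_changed, ages)
```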

Conclusion

The UPDATE command is fairly simple – you just specify the table to update, the values to set and the conditions a row must meet for the update to be run on that row. One final thought though – take another look at Mike’s email address and try and guess what music I was listening to when I wrote that one :).

Yep, it was of course Feeder ;).

SQL SELECT command

Sunday, September 16th, 2007 | Life, Tech

Whether you connect to MySQL, MS Access or another SQL database, and whether you do it via PHP, ASP or something else, you still use the same basic SQL commands to extract information from the database.

Below is a sample database table called “members”

Username	Name	Email	Posts
Jim	Jim Harris	jim@ntlworld.com	24
Mod	Toby Hunter	t.hunter@btopenworld.com	4
Happy	Simon Gates	theman2003@aol.com	12

Seeing as your database connection and how you phrase the code depend on what language you are using, I won’t cover that. Chances are you will have some variable such as db = “your SQL commands here”.

First of all, you may want to extract all the data. A basic command would be:

SELECT * FROM members

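SQL keywords such as SELECT and FROM are written in capitals by convention. They are not case-sensitive, but capitals make them easy to tell apart from table and column names. SELECT is the first thing you put whenever you are reading or extracting data. It tells the database which columns to take out. For instance, you may only want certain columns.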

SELECT name FROM members

Or

SELECT name,email FROM members

* tells the database to take all the columns from the table.

FROM tells the database which table to take the data from. So for instance, if you had a table called “stats” you would use:

SELECT * FROM stats
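Since I skipped the connection code, here is one hedged sketch of what running these statements can look like from a program. It uses Python’s built-in sqlite3 module with an in-memory copy of the members table; the language and database are my choice for illustration and the same SELECT statements apply elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (username TEXT, name TEXT, email TEXT, posts INTEGER)"
)
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?, ?)",
    [
        ("Jim", "Jim Harris", "jim@ntlworld.com", 24),
        ("Mod", "Toby Hunter", "t.hunter@btopenworld.com", 4),
        ("Happy", "Simon Gates", "theman2003@aol.com", 12),
    ],
)

# The SQL lives in an ordinary string, much like the db variable mentioned above.
query = "SELECT * FROM members"
all_rows = conn.execute(query).fetchall()

# Naming columns returns only those columns, in the order you list them.
names_and_emails = conn.execute("SELECT name, email FROM members").fetchall()

for row in names_and_emails:
    print(row)
```

Each row comes back as a tuple, so SELECT * gives four values per row here and SELECT name, email gives two.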

Next you may want to add conditionals onto the data to only take certain rows:

SELECT * FROM members WHERE name = 'Jim Harris'

This would give you the result:

Jim	Jim Harris	jim@ntlworld.com	24

You can also add several conditional values on:

SELECT * FROM members WHERE username = 'Jim' AND name = 'Jim Harris'

That would produce:

Jim	Jim Harris	jim@ntlworld.com	24

As in mathematics, you don’t have to use the = sign all the time; comparisons such as > and < work too. For instance:

SELECT * FROM members WHERE posts > 10

This would produce

Jim	Jim Harris	jim@ntlworld.com	24
Happy	Simon Gates	theman2003@aol.com	12

The guy with the username “mod” would be missed out as he has not made more than 10 posts.
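The conditions above can be tried out with a short sketch like the one below, which again assumes Python’s built-in sqlite3 module and an in-memory members table. The ? placeholder in the last query is sqlite3’s way of passing a value separately from the SQL text; other database libraries use different placeholder styles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (username TEXT, name TEXT, email TEXT, posts INTEGER)"
)
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?, ?)",
    [
        ("Jim", "Jim Harris", "jim@ntlworld.com", 24),
        ("Mod", "Toby Hunter", "t.hunter@btopenworld.com", 4),
        ("Happy", "Simon Gates", "theman2003@aol.com", 12),
    ],
)

# One condition: only rows whose name matches exactly.
exact = conn.execute("SELECT * FROM members WHERE name = 'Jim Harris'").fetchall()

# Two conditions joined with AND: a row must satisfy both.
both = conn.execute(
    "SELECT * FROM members WHERE username = 'Jim' AND name = 'Jim Harris'"
).fetchall()

# A comparison other than =, with the threshold passed in as a parameter.
min_posts = 10
busy = {
    username
    for (username,) in conn.execute(
        "SELECT username FROM members WHERE posts > ?", (min_posts,)
    )
}
print(exact, both, busy)
```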

Finally, you can also order the rows. For instance, if you wanted to order them by the number of posts they made:

SELECT * FROM members ORDER BY posts DESC

This would produce:

Jim	Jim Harris	jim@ntlworld.com	24
Happy	Simon Gates	theman2003@aol.com	12
Mod	Toby Hunter	t.hunter@btopenworld.com	4

By contrast:

SELECT * FROM members ORDER BY posts ASC

Would produce:

Mod	Toby Hunter	t.hunter@btopenworld.com	4
Happy	Simon Gates	theman2003@aol.com	12
Jim	Jim Harris	jim@ntlworld.com	24

ASC sorts them lowest first (or alphabetically) and DESC sorts them highest first (or reverse alphabetically).
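Here is the ordering in runnable form, as a sketch that assumes Python’s built-in sqlite3 module and the same in-memory members table. It sorts by a number column in both directions and then by a text column to show the alphabetical case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (username TEXT, name TEXT, email TEXT, posts INTEGER)"
)
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?, ?)",
    [
        ("Jim", "Jim Harris", "jim@ntlworld.com", 24),
        ("Mod", "Toby Hunter", "t.hunter@btopenworld.com", 4),
        ("Happy", "Simon Gates", "theman2003@aol.com", 12),
    ],
)

def usernames(sql):
    """Run a query and return just the first column of each row, in order."""
    return [row[0] for row in conn.execute(sql)]

most_posts_first = usernames("SELECT username FROM members ORDER BY posts DESC")
fewest_posts_first = usernames("SELECT username FROM members ORDER BY posts ASC")

# On a text column, ASC means alphabetical order.
by_full_name = usernames("SELECT username FROM members ORDER BY name ASC")

print(most_posts_first)
print(fewest_posts_first)
print(by_full_name)
```

Without an ORDER BY the database is free to return rows in any order, so add one whenever the order matters.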

Adding data to the registry in Visual Basic

Thursday, December 30th, 2004 | Programming, Tech

This sample shows a button which, when clicked, changes the registry. The first three bits of information are the categories in the registry (the application name, the section and the key), while the final one is the value.

Private Sub button1_Click()
    ' Store "Yes" under the "Clicked" key in the "Options" section for "My App"
    SaveSetting "My App", "Options", "Clicked", "Yes"
End Sub

You can also make the information dynamic. For instance, if you wanted to save a user’s name you could have a text box called Name1. When they clicked a button, it would then save the name they entered into the registry.

Private Sub button1_Click()
    ' Save whatever the user typed into the Name1 text box
    SaveSetting "My App", "Users", "Username", Name1.Text
End Sub

Reading from the Registry in Visual Basic

Thursday, December 30th, 2004 | Programming, Tech

In a previous article I showed you how to save settings in the registry. Now I’m going to show you how to get them back out again.

The basic syntax is as follows.

GetSetting("MyApp", "Category", "InfoName", "DefaultValue")

So let’s look at the article on saving information in the registry. I used this example to save a username.

SaveSetting "My App", "Users", "Username", Name1.Text

Now, we’re going to pull this information out and display the username in the title of the application. This code goes in the Load sub of your form. The name of the form we are working on is irrelevant because I used the keyword Me, which refers to the form the code is on.

Me.Caption = "Welcome " & GetSetting("My App", "Users", "Username", "To MyApp")

If no value is found in the registry, the default value is used, so if no username was present, the application’s title would be:

Welcome To MyApp

If the username in the registry was “Jimbo”, the application’s title would be:

Welcome Jimbo

When using GetSetting you can’t use it on its own. You must use it as part of an expression, such as the assignment below.

A = GetSetting("My App", "Users", "Username")

You can also do the following.

If GetSetting("My App", "Options", "Clicked", "No") = "Yes" Then MsgBox "Already clicked"

It simply supplies one piece of information from the Registry just like a variable.