Most coolest tools

Part 1 (2017-01-31) - querying StackOverflow question data

I was curious about what was new under the sun and what people are working with.

I then found that StackOverflow lets you query their data, so I forked an existing query that compares the number of questions per tag in a given time frame to that tag's total.

Then, I created a "Freshness" ratio - the percentage of all questions with a given tag that were asked in the last year (newer than 2017-01-31).

Sorting by that ratio, I got obscure tags such as "asp.net-core-2.0" and "laravel-5.4" - which are specific versions of tools released that year. This did not please me.

Therefore, I added another factor to the formula: the count of questions asked last year. This gave me the results I wanted - the "asp.net-core" tag was above "asp.net-core-2.0", and "laravel" was above "laravel-5.4". However, this introduced a bias towards the status quo, which I thought was too strong.

Then, I replaced the count of last year's questions with its square root (after seeing that the logarithm was too penalizing). Looking at the outcome, I think it reflects which skills one might want to learn. Sadly, however, "laravel-5.4" is still above "laravel", but at least "asp.net-core-2.0" is below its base tag, "asp.net-core".
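To make the variants above concrete, here is a minimal sketch of the scoring. It assumes the extra factor is multiplied into the freshness ratio (the post only says it was "added to the formula"), and the question counts are made-up numbers for illustration, not values from the dataset. The names `freshness`, `score` and `ranking` are hypothetical too.

```python
import math

def freshness(recent, total):
    """Share of a tag's questions that were asked in the last year."""
    return recent / total if total else 0.0

def score(recent, total, weight="sqrt"):
    """Freshness multiplied by a weight derived from last year's question count."""
    weights = {
        "none": 1.0,                  # plain freshness ratio
        "count": float(recent),       # strongly favors big, established tags
        "sqrt": math.sqrt(recent),    # softer bonus for volume
        "log": math.log(recent + 1),  # flattens volume the most
    }
    return freshness(recent, total) * weights[weight]

# Hypothetical (questions last year, questions in total), for illustration only.
tags = {
    "base-tag": (20000, 100000),
    "base-tag-2.0": (900, 1000),
}

def ranking(weight):
    """Tag names sorted from highest to lowest score under the given weight."""
    return sorted(tags, key=lambda t: score(*tags[t], weight=weight), reverse=True)

for weight in ("none", "count", "sqrt", "log"):
    print(weight, ranking(weight))
```

With these invented numbers, the versioned tag wins under the plain ratio and under the logarithm, while the base tag wins under the raw count and the square root. Real data will not split this cleanly, as the "laravel-5.4" result shows.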

Without further ado, here are the results:

I was flabbergasted that Python grew so much lately: 24.1% of all its questions were asked in just the last year. I had no idea.

The dataset is available here, in case you want to copy/pasta it and sort it according to your own formula. If you do that, I'd appreciate it if you shared your formula, so I can also learn something :)

I can't wait to dig into these technologies and see what they're about. They seem to be JS frameworks (angular (which is a completely redesigned framework compared to angularjs!!!), reactjs, react-native (erm... this one is for mobile apps), vue.js), programming languages (python, javascript, typescript, swift), and, to my surprise, even machine learning tools (tensorflow, pandas, r, keras).

Anyway, this is an indication of what people are asking questions about. It may be because they're having trouble with these tools, or simply because they're working with them. Choosing to learn or use one of the technologies above means you will find at least some of your questions already answered :)

Part 2 (2018-02-01) - scraping job listing tags

To get a different picture, I scraped some of the StackOverflow jobs pages. They have tags, and I parsed them and sorted them. Here's the script I used (following my own tutorial from ages ago), and here are some results I got.

Note that the results are upside-down, with the most in-demand skills at the bottom (CLI-friendly). I cut off the ones below a score of 0.25 (less than 25% in-demand compared to the top skill, Java).
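The general shape of this scoring can be sketched as follows: count how many jobs mention each tag, divide by the count of the most common tag, drop anything under the cutoff, and sort ascending. This is a simplified reconstruction, not the actual script, which may weight tags differently. The function name `relative_demand` and the sample jobs are invented for illustration.

```python
from collections import Counter

def relative_demand(jobs, cutoff=0.25):
    """Score each tag relative to the most common tag, lowest score first."""
    # Count each tag at most once per job.
    counts = Counter(tag for tags in jobs for tag in set(tags))
    if not counts:
        return []
    top = max(counts.values())
    scored = [(tag, n / top) for tag, n in counts.items() if n / top >= cutoff]
    # Ascending by score, so the most in-demand skills end up at the bottom.
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))

# Made-up job listings, each represented by its list of tags.
jobs = [
    ["java", "linux", "docker"],
    ["java", "javascript"],
    ["java", "python", "linux"],
    ["javascript", "reactjs"],
    ["java", "javascript", "python"],
]

results = relative_demand(jobs)
for row in results:
    print(row)
print("Total jobs:", len(jobs))
print("Avg tags per job:", sum(len(tags) for tags in jobs) / len(jobs))
```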


 ('typescript', 0.2532467532467532),
 ('html5', 0.2857142857142857),
 ('ios', 0.2857142857142857),
 ('css', 0.2857142857142857),
 ('sql', 0.2857142857142857),
 ('go', 0.3181818181818182),
 ('reactjs', 0.33116883116883117),
 ('c++', 0.33766233766233766),
 ('agile', 0.38311688311688313),
 ('security', 0.3896103896103896),
 ('node.js', 0.40909090909090906),
 ('sysadmin', 0.448051948051948),
 ('docker', 0.48701298701298695),
 ('php', 0.5259740259740259),
 ('amazon-web-services', 0.6948051948051948),
 ('python', 0.7467532467532466),
 ('linux', 0.8506493506493507),
 ('javascript', 1.0),
 ('java', 1.0)]
Total jobs: 47
Avg tags per job: 4.21276595745

Palo Alto

 ('networking', 0.25),
 ('tcp', 0.25),
 ('cloud', 0.25),
 ('user-experience', 0.25),
 ('java-ee', 0.25),
 ('platform', 0.25),
 ('flask', 0.25),
 ('android', 0.25),
 ('css', 0.25),
 ('ios', 0.25),
 ('mobile', 0.25),
 ('design', 0.25),
 ('user-interface', 0.25),
 ('microservices', 0.30000000000000004),
 ('spring', 0.30000000000000004),
 ('d3.js', 0.30000000000000004),
 ('html', 0.30000000000000004),
 ('mongodb', 0.30000000000000004),
 ('redux', 0.30000000000000004),
 ('c', 0.3375),
 ('c++', 0.3375),
 ('devops', 0.4),
 ('web-services', 0.45000000000000007),
 ('machine-learning', 0.45000000000000007),
 ('sql', 0.5),
 ('python', 0.55),
 ('linux', 0.55),
 ('mysql', 0.6000000000000001),
 ('amazon-web-services', 0.7000000000000001),
 ('ruby', 0.7000000000000001),
 ('reactjs', 0.8),
 ('javascript', 1.0),
 ('java', 1.0)]
Total jobs: 25
Avg tags per job: 4.32

Also note that the results might be customized by StackOverflow for me personally. I'd love your help if you want to run the script on your own HTML saved from the browser.
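If you want to try this on a saved page without the original script, here is a rough sketch using only the standard library. It assumes that each tag is rendered as an `<a>` element with a `post-tag` class; that class name is a guess on my part, so check your saved HTML and adjust it. `TagCollector` and `extract_tags` are hypothetical names, not part of the script linked above.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Collects the text of every <a> element whose class list includes tag_class."""

    def __init__(self, tag_class="post-tag"):
        super().__init__()
        self.tag_class = tag_class  # assumed class name; verify against your HTML
        self.tags = []
        self._in_tag = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and self.tag_class in classes:
            self._in_tag = True
            self.tags.append("")

    def handle_data(self, data):
        if self._in_tag:
            self.tags[-1] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._in_tag:
            self._in_tag = False
            self.tags[-1] = self.tags[-1].strip()

def extract_tags(html):
    """Return the tag names found in an HTML string, in document order."""
    collector = TagCollector()
    collector.feed(html)
    return collector.tags

# Tiny invented snippet standing in for a saved page.
sample = (
    '<div><a class="post-tag" href="#">java</a> '
    '<a class="post-tag no-tag-menu">linux</a>'
    '<a href="#">Apply</a></div>'
)
print(extract_tags(sample))
```

To use it on a real file, read the saved page with `open(path, encoding="utf-8").read()` and pass the string to `extract_tags`.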
