Technical Books I’m Reading

I try to keep current on technology. As weird as it may seem, to be an IT industry analyst, you don’t have to know much about technology. You can understand the market without knowing the technology that drives it. It’s limiting but possible.

To really understand IT customers – truly grok them – you need to live a bit in their world. It is my belief that understanding technology provides insights into the market.

More importantly, I like information technology, programming, and all things geeky. It was my profession for many years before I moved to the business side, and my heart is still there. So, it is for myself as much as my clients and audience that I continue to go deep in technology.

I have also recently discovered Humble Bundle. They make collections of e-books, comics, and games available for a very low price and donate much of the proceeds to various charities. You can donate as little as US$1.00 and get four or five books. Check them out. They’re awesome.

Subsequently, I have been feasting on technical books on a variety of subjects. Besides my usual array of technology sites and news, here’s what I have been reading.

  • Head First Data Analysis, Michael Milton, O’Reilly – Semi-technical, accessible introduction to the concepts of data science.
  • Doing Data Science, Cathy O’Neil and Rachel Schutt, O’Reilly – A more in-depth exploration of the process of data science.
  • Think Bayes, Allen B. Downey, O’Reilly – Tutorial on Bayesian statistics.
  • Think Stats, Allen B. Downey, O’Reilly – Tutorial on classical statistics.
  • Mastering Docker 2nd edition, Russ McKendrick and Scott Gallagher, Packt – Both introductory and advanced Docker concepts. Good starter for the budding container enthusiast.
  • Getting Started with Kubernetes, Jonathan Baier, Packt – Introduction and tutorial for Kubernetes.
  • Blockchain Basics: A Non-Technical Introduction in 25 Steps, Daniel Drescher, Packt – Introduction to Blockchain. The non-traditional style was hard for me to get used to.
  • Mastering Blockchain, Imran Bashir, Packt – More traditional and in-depth introduction to Blockchain and major implementations of it such as Bitcoin and Ethereum. I’m reading this now.

I’ve got a lot of books coming up – I bought 41 of them for something like US$35 – including a set of Java books, and more on Cloud, Data Science, and Blockchain/Bitcoin. There’s a book on OpenStack that looks interesting. R in a Nutshell, Thoughtful Machine Learning with Python, and Java 8 Lambdas are all possibilities too. That assumes that Humble Bundle doesn’t wave something interesting in my face. I almost bought the last Python bundle but resisted. Oh, and I have a ton of Linux books waiting in the wings too.

Of course, the group above tracks my current interests. I’ve been writing code in Java since the 1990s, when Java v1.0 was mostly associated with adding applets to websites. Cloud, Containers, DevOps, Blockchain, and Data Science are top of mind for me professionally and for the IT community as a whole. These books speak to the everyday work of developers, which is what interests me the most.

So, I’m more than happy to settle in with a good book so long as it’s techy.

Machine Learning May Help Developers

This blog was originally published on the Amalgam Insights website.

As the fall season of tech conferences starts to wind down, something is quite clear – machine learning (ML) is going to be everywhere. Box is putting ML in content management, Microsoft in Office and CRM, and Oracle is infusing it into, well, everything. While not a great leap forward on the order of the Internet, smartphones, or PCs, the inclusion of ML technology in so many applications will make a lot of mundane tasks easier. This trend promises to be a boon for developers. The strength of machine learning is finding and exploiting patterns and anomalies. What could be more useful to developers? Here are some examples:

  • Coding – The most obvious application of machine learning is in the coding of applications itself. Coding is based around patterns that are known to work (design patterns, best practices, etc.) and automating them is always going to be helpful. Automating the creation of new code, however, will have only incremental value at best. Modern IDEs already have code completion, library and API lookup, and automated code generation. In other words, there are already plenty of features that help a developer automate the more tedious and inefficient parts of the job. With the proliferation of APIs, SDKs, and code libraries in use, more intelligent search is a useful application of ML. With machine learning, the IDE may be able to anticipate which APIs and libraries a developer needs from the context of the code and suggest them.
  • Debugging – Where ML will probably help the most is in debugging. Debugging code is the hardest part of software development. Often, debugging feels like trying to find the needle in the haystack. It’s even harder to debug someone else’s code, and this is where ML will come in handy. Most developers have certain patterns to the mistakes they make. It’s human nature, like always drifting to the right when walking in the woods. ML would help to find an individual programmer’s patterns and styles and look for instances where a mistake is being introduced. In addition, there are distinct patterns in good code, and the ability to discover anomalies in those patterns would help to identify bugs quickly.
  • Testing – Another area where machine learning can help developers is managing test data. Intelligent creation of test environments, environments that mirror real-world patterns, can be derived by analyzing production applications and developing test data sets. Test data created this way could match the range of situations an application typically might encounter without using actual extracted data. Using machine learning to create test data would give developers the kind of test data they need without having to deidentify real data or risk violating customer privacy.
  • Project management – Large transformation projects pose considerable problems for project management. With teams spread out over distances, working on many parts of the project simultaneously, it can be difficult to coordinate resources and personnel to maintain development efficiency. Just getting a picture of the state of a large-scale project requires a number of people reporting on progress in addition to metrics gathered automatically by tracking systems. This can be highly inefficient. Much is left to the interpretation of generated data and subjective assessments of progress. Simple metrics, such as the burn-down rate, are interpreted in the context of individual managers’ goals and are subject to bias. Development managers and project managers all have psychological factors that affect the assessment of a project and can delay acting on warning signs. Machine learning, on the other hand, can be used to analyze patterns in the development data over several parts of a project or over many projects to discern anomalies and warning signs. With this knowledge, project managers will be able to manage dependencies, see the unhealthy signs of failure without bias, and gain assistance when rebalancing resources and addressing problems.
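
To make the idea of spotting anomalies in project data a little more concrete, here is a minimal sketch in Python. It is not a trained machine learning model. It is a simple statistical stand-in (a modified z-score based on the median) that only illustrates the kind of warning sign a real system might surface. The function name, the threshold of 3.5, and the per-sprint numbers are all assumptions made up for illustration.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return the positions of values that sit unusually far from the median.

    Uses the modified z-score, 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. The threshold is an assumed cut-off, not a
    tuned value.
    """
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations)
    if mad == 0:
        # All values are (nearly) identical, so nothing stands out.
        return []
    return [i for i, d in enumerate(deviations)
            if 0.6745 * d / mad > threshold]


# Hypothetical story points completed in eight consecutive sprints.
completed_per_sprint = [21, 23, 20, 22, 24, 21, 6, 22]

print(flag_anomalies(completed_per_sprint))  # prints [6]
```

The seventh sprint (position 6, counting from zero) is flagged without anyone having to interpret the numbers. A production tool would presumably learn from many projects and many metrics at once, but the principle of flagging what breaks the pattern is the same.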

IT managers are still more interested in leveraging machine learning to enhance the product of their labors, not the process. In time, developers and project managers will see the value in employing machine learning to help manage large projects, especially when spread over the globe. The gains in efficiency and early warning alone are worth the effort.