
Entries in Information Engineering (13)

Friday, October 10, 2014

Having the Best doesn't necessarily work if you don't have the knowledge that supports it

F1 racing is the most technically advanced racing out there.  More money and more technology is thrown at winning than at any other form of racing.  Back in the early 90s, working at Microsoft, a bunch of us would get together at somebody’s house at 6 a.m. on Sunday mornings to watch the European F1 races.  One guy was so into F1 that he quit Microsoft and joined the Ferrari race team to work on the computer systems in the cars.

McLaren Racing dumped Mercedes engines for Honda for the 2015 season, and part of the reason is that McLaren wanted access to the source code for the engine systems.

"A modern grand prix engine at this moment in time is not just about sheer power; it's about how you harvest the energy, store the energy and effectively if you don't have control of that process - meaning access to source code - then you are not going to be able to stabilise your car in the entry to corners, for instance, and you lose lots of lap time. So even though you have the same brand of engine you do not have the ability to optimise the engine."

I have been out of following F1, but 2015 might be when I start following again.  Here is a video Honda released on their 2015 engine.  Honda has bet on one team, McLaren, to win, which means they’ll be sharing everything they can to get the most performance out of their engine.

Friday, June 6, 2014

Two Things that will Make Your Data Center AI Projects Hard to Execute - Data & Culture

It was predictable that once Google shared its use of machine learning in a mathematical model of a mechanical system, others would say they can do it too.  DCK has a post on Romonet and Vigilent, two other companies that use AI concepts in data centers.

Google made headlines when it revealed that it is using machine learning to optimize its data center performance. But the search giant isn’t the first company to harness artificial intelligence to fine-tune its server infrastructure. In fact, Google’s effort is only the latest in a series of initiatives to create an electronic “data center brain” that can analyze IT infrastructure.

...

One company that has welcomed the attention around Google’s announcement is Romonet, the UK-based maker of data center management tools.

...

 Vigilent, which uses machine learning to provide real-time optimization of cooling within server rooms.

Google has been using machine learning for a long time and uses it for many other things, like its Google Prediction API.

What is the Google Prediction API?

Google's cloud-based machine learning tools can help analyze your data to add the following features to your applications:

Customer sentiment analysis
Spam detection
Message routing decisions
Upsell opportunity analysis
Document and email classification
Diagnostics
Churn analysis
Suspicious activity identification
Recommendation systems
And much more...

Here is a YouTube video from 2011 where Google tells developers how to use this API.

Learn how to recommend the unexpected, automate the repetitive, and distill the essential using machine learning. This session will show you how you can easily add smarts to your apps with the Prediction API, and how to create apps that rapidly adapt to new data.

So you are all pumped up to get AI in your data center.  But here are two things you need to be aware of that can make your projects harder to execute.

First, the quality of your data.  Everyone has heard garbage in, garbage out.  But when you create machine learning systems, the accuracy of the data can be critical.  Google’s Jim Gao, their data center “boy genius,” discusses one example.

 Catching Erroneous Meter Readings

In Q2 2011, Google announced that it would include natural gas as part of ongoing efforts to calculate PUE in a holistic and transparent manner [9]. This required installing automated natural gas meters at each of Google’s DCs. However, local variations in the type of gas meter used caused confusion regarding erroneous measurement units. For example, some meters reported 1 pulse per 1000 scf of natural gas, whereas others reported a 1:1 or 1:100 ratio. The local DC operations teams detected the anomalies when the real-time, actual PUE values exceeded the predicted PUE values by 0.02 - 0.1 during periods of natural gas usage.

Going through all your data inputs to make sure the data is clean is painful.  Google used 70% of its data to train the model and 30% to validate the model.  Are you that disciplined?  Do you have a mechanical engineer on staff who can review the accuracy of your mathematical model?
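As a rough illustration of the discipline involved, here is a minimal sketch (my own, not Google’s code) of a chronological 70/30 train/validation split plus a simple check that flags samples where actual PUE runs well above predicted PUE, which is how the gas meter anomalies above were caught.  The column names and the 0.02 threshold are assumptions for the example.

```python
# Hypothetical sketch: 70/30 split and a PUE anomaly check (not Google's code).
import pandas as pd

def split_train_validation(df: pd.DataFrame, train_fraction: float = 0.7):
    """Keep samples in time order: first 70% for training, last 30% for validation."""
    df = df.sort_values("timestamp")
    cutoff = int(len(df) * train_fraction)
    return df.iloc[:cutoff], df.iloc[cutoff:]

def flag_suspect_meter_data(df: pd.DataFrame, threshold: float = 0.02) -> pd.DataFrame:
    """Flag samples where actual PUE exceeds predicted PUE by more than the threshold.

    Google's team noticed the gas meter unit errors when actual PUE ran 0.02 - 0.1
    above the predicted PUE; this check mimics that idea.
    """
    return df[(df["actual_pue"] - df["predicted_pue"]) > threshold]

# Example usage with made-up column names:
# data = pd.read_csv("pue_samples.csv")   # columns: timestamp, actual_pue, predicted_pue, ...
# train, validation = split_train_validation(data)
# print(flag_suspect_meter_data(validation).head())
```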

Second, the culture in your company is intangible to many.  But if you have been around enough data center operations staff, their habits and methods are not intangible.  They are real, and they are what makes so many things happen.  Going back to Google’s Jim Gao: he had a wealth of subject matter expertise on machine learning and other AI methods within Google.  He had help deploying the models from Google staff.  And he had the support of the VP of data centers and the local data center operations teams.

I would like to thank Tal Shaked for his insights on neural network design and implementation. Alejandro Lameda Lopez and Winnie Lam have been instrumental in model deployment on live Google data centers. Finally, this project would not have been possible without the advice and technical support from Joe Kava, as well as the local data center operations teams.

Think about these issues of data quality and the culture in your data center before you attempt an AI project.  If you dig into automation projects, you'll find they are rarely as easy as people thought they would be.

Tuesday, June 3, 2014

Google's Data Center Machine Learning enables shaving Electricity Peak Demand Charges

A week ago I was able to interview Google’s Joe Kava, VP of Data Centers, regarding Better Data Centers through Machine Learning.  The media coverage is good, and almost everyone focuses on the potential for lower power consumption.

Google has put its neural network technology to work on the dull but worthy problem of minimizing the power consumption of its gargantuan data centers.

One of the topics I was able to discuss with Joe is the idea that accurate prediction of PUE and a mathematical model of the mechanical systems enable Google to focus on the peak demand during the billing period to reduce overall charges.  The above quote says power consumption is dull.  So what is focusing on peak power demand?  Crazy?  Or maybe you understand a variable cost of running your data center. :-)

Understanding Peak Demand Charges

How you get billed is complicated and varies widely depending on your specific contract, but it’s important for you to understand your tariff. Without knowing exactly how you're billed for energy, it's difficult to prioritize which energy savings measures will have the biggest impact.

...

In many cases, electricity use is metered (and you are charged) in two ways by your utility: first, based on your total consumption in a given month, and second, your demand, based on the highest capacity you required during the given billing period, typically a 15-minute interval during that billing cycle.

To use an analogy, think about consumption as the number that registers on your car’s odometer – to tell you how far you’ve driven – and demand as what is captured on your speedometer at the moment when you hit your max speed. Consumption is your overall electricity use, and demand is your peak intensity, or maximum “speed.”

National Grid does a great job explaining this: "The price we pay for anything we buy contains the cost of the product plus profit, plus the cost of making the product available for sale, or overhead.” They suggest that demand is akin to an overhead expense and note that “this is in contrast to charges…customers pay for the electricity itself, or the ‘cost of product,’ largely made up of fuel costs incurred in the actual generation of energy. Both consumption and demand charges are part of every electricity consumer’s service bill.”

When you think about the ROI of reducing your energy consumption, the business people should understand both the overall consumption and the peak demand of their operations.  Unfortunately, it is all too common for people to focus only on the $/kWh.
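To make the two parts of the bill concrete, here is a minimal sketch of how consumption and demand charges could be computed from 15-minute interval data.  The rates and load profile are made up; your tariff will differ.

```python
# Hypothetical two-part electricity bill from 15-minute interval data (illustrative rates only).

def monthly_bill(interval_kw, energy_rate=0.08, demand_rate=12.0, interval_hours=0.25):
    """Consumption charge: total kWh in the month * $/kWh (the "odometer").
    Demand charge: highest 15-minute average kW in the month * $/kW (the "speedometer")."""
    consumption_kwh = sum(kw * interval_hours for kw in interval_kw)
    peak_demand_kw = max(interval_kw)
    return {
        "consumption_charge": consumption_kwh * energy_rate,
        "demand_charge": peak_demand_kw * demand_rate,
        "peak_demand_kw": peak_demand_kw,
    }

# Example: a flat 1,000 kW load for 31 days with one 15-minute spike to 1,300 kW.
intervals = [1000.0] * 2975 + [1300.0]
print(monthly_bill(intervals))
```

That single 15-minute spike sets the demand charge for the whole billing period even though it barely moves total consumption, which is why shaving the peak matters.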

Google can look at the peak power consumption and see if there are ways the PUE could be improved to reduce the peak power for the billing period.


Here are tips that can help you shave peak demand.

Depending on your rate structure, peak demand charges can represent up to 30% of your utility bill. Certain industries, like manufacturing and heavy industrials, typically experience much higher peaks in demand due largely to the start-up of energy-intensive equipment, making it even more imperative to find ways to reduce this charge – but regardless of your industry, taking steps to reduce demand charges will save money.

...

Consider no or low-cost energy efficiency adjustments you can make immediately. When you start up your operations in the morning, don't just flip the switch on all of your high intensity equipment. Consider a staged start-up: turn on one piece of equipment at a time, create a schedule where the heaviest intensity equipment doesn’t all operate at full tilt simultaneously, and think about what equipment can be run at a lower intensity without adverse effect. You may use more kWh – resulting in greater energy consumption or a higher “energy odometer” reading as discussed above – but you'll ultimately save on demand charges and your energy bill overall will be lower.
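To see why a staged start-up lowers the demand charge, here is a small sketch (with made-up equipment loads) comparing the 15-minute peak of a simultaneous start against a staggered one.

```python
# Hypothetical comparison of simultaneous vs. staged equipment start-up (made-up loads).

equipment_kw = [400, 300, 250, 200]   # start-up draw of four pieces of equipment, in kW

# Simultaneous start: every unit draws its start-up load in the same 15-minute interval.
simultaneous_peak = sum(equipment_kw)

# Staged start: one unit per 15-minute interval; earlier units settle to 60% of start-up draw.
settled_fraction = 0.6
staged_peak = max(
    sum(kw * settled_fraction for kw in equipment_kw[:i]) + equipment_kw[i]
    for i in range(len(equipment_kw))
)

print(f"Simultaneous start peak: {simultaneous_peak} kW")   # 1150 kW
print(f"Staged start peak:       {staged_peak:.0f} kW")     # 770 kW
```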

 

Thursday, May 29, 2014

Does Google's Data Center Machine Learning Model have a debug mode? It should

I threw two posts (1st post and 2nd post) up on Google’s use of machine learning in the data center and said I would write more.  Well, here is another one.

Does Google’s Data Center Machine Learning Model have a debug mode?  The current system is described as using data collected every 5 minutes over about 2 years.

184,435 time samples at 5 minute resolution (approximately 2 years of operational data)

One thing almost no one does is debug their mechanical systems the way they would debug software.

Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware, thus making it behave as expected. Debugging tends to be harder when various subsystems are tightly coupled, as changes in one may cause bugs to emerge in another.

What would a debug mode look like in DCMLM (my own acronym for Data Center Machine Learning Model)?  You are seeing performance that suggests a subsystem is not behaving as expected.  Change the sampling rate to 1 second.  Hopefully the controller will function correctly at the higher sample rate.  The controller may work fine, but the transport bus may not.  With the 1-second fidelity, make changes to settings and collect data.  Repeat the changes.  Compare results.  Create other stress cases.

What will you see?  From the time you make a change to a setting, how long does it take to reach the desired state?  At 5-minute sampling you cannot see the transition and the possible delays.  Was the transition smooth or a step function?  Was there an overshoot in value followed by corrections?

The controllers have code running in them, sensors go bad, and wiring connections are intermittent.  How do you find these problems?  Being able to go into a debug mode could be useful.
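As a sketch of what a debug mode could report, here is a small example (my own, with made-up numbers, not part of DCMLM) that takes 1-second samples after a setpoint change and computes the overshoot and settling time that 5-minute sampling cannot show.

```python
# Hypothetical step-response check on 1-second samples (made-up data).

def analyze_transition(samples, setpoint, tolerance=0.5):
    """Overshoot: largest excursion past the setpoint in the direction of travel.
    Settling time: first second after which every sample stays within tolerance of the setpoint."""
    if samples[0] > setpoint:   # stepping down, so overshoot means dipping below the setpoint
        overshoot = max(0.0, setpoint - min(samples))
    else:                       # stepping up, so overshoot means rising above the setpoint
        overshoot = max(0.0, max(samples) - setpoint)
    settling_time = None
    for t in range(len(samples)):
        if all(abs(v - setpoint) <= tolerance for v in samples[t:]):
            settling_time = t
            break
    return {"overshoot": overshoot, "settling_time_s": settling_time}

# Example: leaving water temperature responding to a setpoint change from 68F to 65F.
samples = [68.0, 67.2, 66.1, 65.0, 64.1, 63.8, 64.3, 64.8, 65.1, 65.0, 65.0, 65.0]
print(analyze_transition(samples, setpoint=65.0))   # overshoot of 1.2F, settles in about 7 seconds
```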

If Google were able to compare the detailed operations of two different installations of the same mechanical system, they could find whether a problem is unique to a site.  Or they may simply compare the same system at different points in time.

Wednesday, May 28, 2014

Google's Machine Learning Application is a Tool for AI, but not AI

Not AI: it is machine learning, a tool that supports a mathematical model of a data center's mechanical systems

 

If you Google Image Search “artificial intelligence” you see these images.

[Image: Google Image Search results for “artificial intelligence”]

This is not what Google’s data center group has built with an application of machine learning.  When you Google Image Search “neural network” you see this.

 

[Image: Google Image Search results for “neural network”]

 

Google’s method to improve the efficiency of its data centers by optimizing for cost is a machine learning application, not, as covered in the media, an artificial intelligence system.  With artificial intelligence, it is easy for many to assume the system thinks.  Google’s machine learning model takes 19 inputs, then creates a predicted PUE with 99.6% accuracy and the settings to achieve that PUE.

 

Problems to be solved

  1. The interactions between DC Mechanical systems and various feedback loops make it difficult to accurately predict DC efficiency using traditional engineering formulas.
  2. Using standard formulas for predictive modeling often produces large errors because they fail to capture such complex interdependencies.
  3. Testing each and every feature combination to maximize efficiency would be unfeasible given time constraints, frequent fluctuations in the IT load and weather conditions, as well as the need to maintain a stable DC environment.

These problems describe the difficulty of building a mathematical model of the system.

 

Why Neural Networks? 

To address these problems, a neural network is selected as the mathematical framework for training DC energy efficiency models. Neural networks are a class of machine learning algorithms that mimic cognitive behavior via interactions between artificial neurons [6]. They are advantageous for modeling intricate systems because neural networks do not require the user to predefine the feature interactions in the model, which assumes relationships within the data. Instead, the neural network searches for patterns and interactions between features to automatically generate a best fit model.

 

There are 19 different factors that are inputs to the neural network:

1. Total server IT load [kW]
2. Total Campus Core Network Room (CCNR) IT load [kW]
3. Total number of process water pumps (PWP) running
4. Mean PWP variable frequency drive (VFD) speed [%]
5. Total number of condenser water pumps (CWP) running
6. Mean CWP variable frequency drive (VFD) speed [%]
7. Total number of cooling towers running
8. Mean cooling tower leaving water temperature (LWT) setpoint [F]
9. Total number of chillers running
10. Total number of drycoolers running
11. Total number of chilled water injection pumps running
12. Mean chilled water injection pump setpoint temperature [F]
13. Mean heat exchanger approach temperature [F]
14. Outside air wet bulb (WB) temperature [F]
15. Outside air dry bulb (DB) temperature [F]
16. Outside air enthalpy [kJ/kg]
17. Outside air relative humidity (RH) [%]
18. Outdoor wind speed [mph]
19. Outdoor wind direction [deg]

 

There are five hidden layers with 50 nodes per layer.  The hidden layers are the blue circles in the diagram below.  The red circles are the 19 inputs.  The yellow circle is the output, the predicted PUE.

[Figure: Neural network diagram with the 19 inputs (red), five hidden layers of 50 nodes each (blue), and the predicted PUE output (yellow)]
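As an architectural sketch only, the same network shape (19 inputs, five hidden layers of 50 nodes, one PUE output) can be expressed in a few lines with scikit-learn.  This is not Google’s implementation; the training data here is random placeholder data, and the logistic activation is my assumption.

```python
# Hypothetical sketch of the network shape described above (not Google's code).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.random((1000, 19))           # placeholder for the 19 operating inputs
y = 1.1 + 0.1 * rng.random(1000)     # placeholder PUE values around 1.1 - 1.2

# 19 inputs -> five hidden layers of 50 nodes each -> 1 output (predicted PUE)
model = MLPRegressor(hidden_layer_sizes=(50, 50, 50, 50, 50),
                     activation="logistic",   # assumption: a sigmoid-style activation
                     max_iter=2000)
model.fit(X, y)
print("Predicted PUE for one sample:", model.predict(X[:1])[0])
```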

Multiple iterations are run to reduce cost.  The cost function is below.

[Image: cost function equation from Gao's paper]
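The exact expression in the image may differ, but the standard quadratic (mean squared error) cost for this kind of regression looks like the following, where $m$ is the number of training samples, $h_\theta(x^{(i)})$ is the predicted PUE, and $y^{(i)}$ is the actual PUE (any regularization term is omitted in this sketch):

$$ J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta\!\left(x^{(i)}\right) - y^{(i)}\right)^{2} $$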

 

Results:

A machine learning approach leverages the plethora of existing sensor data to develop a mathematical model that understands the relationships between operational parameters and the holistic energy efficiency. This type of simulation allows operators to virtualize the DC for the purpose of identifying optimal plant configurations while reducing the uncertainty surrounding plant changes.

 


 

Note: I have used Jim Gao’s document with some small edits to create this post.