[slides] The Dark Side of IoT | @ThingsExpo #BigData #IoT #M2M #API

Personalization has long been the holy grail of marketing. Simply stated, communicate the most relevant offer to the right person and you will increase sales. To achieve this, you must understand the individual. Consequently, digital marketers developed many ways to gather and leverage customer information to deliver targeted experiences.
In his session at @ThingsExpo, Lou Casal, Founder and Principal Consultant at Practicala, discussed how the Internet of Things (IoT) has accelerated our ability to monitor behavior, gather even more customer data and potentially deliver outstanding experiences. Regrettably, as history demonstrates, there are countless examples of how this can go terribly wrong. Perhaps it’s time to learn from our mistakes.


IDC outlook sees global cloud IT infrastructure revenue at $7.7bn for Q216


According to the latest research from analyst house IDC, worldwide cloud IT infrastructure revenue hit $7.7 billion (£6.05bn) in the second quarter of 2016, at a growth rate of 14.5%.

The data, which comes from the company’s latest worldwide quarterly cloud IT infrastructure tracker, found Hewlett Packard Enterprise (HPE) remains top of the tree for global infrastructure vendor revenue, with the same market share – 16.4% – as this time last year, although not growing as quickly as Dell and Cisco, in silver and bronze medal position respectively.

EMC is clear in fourth, with a cluster of providers – Lenovo, NetApp, IBM, Huawei, and Inspur – in the 2.5% to 3.3% market share range, all of which IDC places in joint fifth. It’s worth noting too that, with EMC now owned by Dell, combining their revenue figures would put them ahead of HPE.

In terms of the three primary infrastructure products – Ethernet switch, storage, and server – growth in private and public cloud infrastructure was driven primarily by Ethernet switch, which grew 49.4% and 61.8% year over year respectively. Interestingly, while Latin America saw the fastest regional growth at 44.0% annually, Western Europe (41.2%) and Japan (35.4%) – two relatively mature markets – were the next fastest growers. In comparison, the US (6.7%) trailed.

IDC argues that despite these figures there was something of a slowdown in the first two quarters of the year, with the finger of blame being pointed at hyperscale. The rest of the year does have a positive outlook, however. “As expected, the hyperscale slowdown continues in the second quarter of 2016,” said Kuba Stolarski, research director for computing platforms at IDC. “However, deployments to mid-tier and small cloud service providers showed strong growth, along with private cloud buildouts.

“In general, the second quarter did not have as difficult a compare to the prior year as the first quarter did, and this helped improve growth results across the board compared to last quarter,” Stolarski added. “In the second half of 2016, IDC expects to see strengthening in public cloud growth as key hyperscalers bring new data centres online around the globe, continued strength in private cloud deployments, and declines in traditional, non-cloud deployments.”

Breaking boundaries: How Freightos achieved high speed graph search in the cloud


The problem

Freightos’s application runs heavy-duty graph algorithms against a very large dataset. This requires some unusual design principles, which no cloud platform today is optimised for. But we succeeded in building the application on top of the Google App Engine Flexible Environment, a new offering from Google, positioned somewhere between Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). Even Flexible Environment, which is still in beta, does not yet fully meet the needs of high-speed large-scale graph traversal, but we can say that it is moving in the right direction and may yet be the first platform to do this.

How it began

When Freightos started to build a global freight-shipping marketplace, there were two choices in the Google Cloud Platform for deploying our Software as a Service, along with the parallel offerings from Amazon Web Services (AWS).

The first was the Google Compute Engine (GCE), an IaaS, where we would implement plumbing ourselves; the other was Google App Engine, a PaaS which would give us the plumbing ready to use.

Designed specifically for web apps, App Engine bundles an application server, HTTP request-handling, load balancing, databases, autoscaling, logging, monitoring, and more. Google automatically scales, upgrades, and migrates App Engine instances as needed.

If we had gone with GCE, we would have had to integrate and maintain all these components ourselves, on top of the operating system. This would require writing scripts to install a Java VM and other runtime plumbing; as well as integrating with remote services like databases, whether served from Google Cloud Platform or elsewhere.

The convenience of the PaaS was attractive, and App Engine is one of Google’s strengths, so we went with it. App Engine is easier to use than GCE, but that ease comes with restrictions. App Engine’s design, aimed at web apps, imposes some strict limitations. State is not meant to persist in memory across HTTP requests except in session caches, and a maximum of 1GB of RAM is made available to applications. Requests must finish within 60 seconds (extendable to 10 minutes with task queues for asynchronous execution), and instances must warm up within 60 seconds as well. Spawned threads must also have a short lifetime, finishing by the end of the HTTP request that spawns them. Long-running background threads are not possible.
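To make the 60-second deadline concrete, here is a minimal sketch of the pattern App Engine nudges you towards: push anything long-running onto a task queue instead of doing it inside the request. The worker URL and parameter name are illustrative, not our production code.

```java
// Sketch: offloading work that cannot finish within App Engine's 60-second
// request deadline onto the default task queue. The "/tasks/recompute-routes"
// worker URL and the "quoteId" parameter are hypothetical.
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class QuoteRequestHelper {

    /** Enqueue a long-running recomputation instead of doing it in-request. */
    public static void enqueueRecompute(String quoteId) {
        Queue queue = QueueFactory.getDefaultQueue();
        queue.add(TaskOptions.Builder
                .withUrl("/tasks/recompute-routes")   // handled by a worker servlet
                .param("quoteId", quoteId));          // task requests get up to 10 minutes
    }
}
```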

App Engine imposes other limitations to achieve isolation of instances and automated lifecycle management. A customised Java VM with a special Security Manager keeps code tightly sandboxed. There’s no access to the underlying OS, including file system, sockets, and more. Persistence, networking, and other services that usually come from the OS have to go through provided APIs. In fact, you can only use a specific white-list of classes from the Java Runtime Environment.

Running into App Engine’s walls

Way back at the beginning, when Freightos offered only a small selection of freight services, App Engine’s constraints were not a problem.

But as Freightos grew to manage millions of freight prices, spanning air, ocean and land around the globe, we ran into a wall, as our application’s needs went way beyond App Engine’s assumptions.

The most important assumption we came up against is that applications hold limited data in memory. But our application needed quick access to 200 GB worth of route data – a lot more than App Engine allows. Of course, Freightos’s need for routing in big graphs is not unique; it is shared by travel and mapping applications. Freightos does have extra dimensions in the data, though. Travel applications route people, and mapping apps route vehicles. With freight, on the other hand, weights, volumes, and commodities affect the routing and pricing of each leg.

Our application works on a Google Datastore instance holding millions of potential routes from a large and changing collection of freight services. On each new quote request, a uniform-cost graph search algorithm traverses the network of shipping legs, optimising for price, transit time, and other variables.
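For illustration only, a stripped-down uniform-cost search over shipping legs might look like the sketch below. The ShippingLeg type, the adjacency map, and the price/time weighting are hypothetical simplifications; the production algorithm optimises over many more variables.

```java
// Minimal uniform-cost search sketch over shipping legs (Dijkstra-style,
// with lazy deletion of stale queue entries).
import java.util.*;

public class RouteSearch {

    /** A single shipping leg between two ports; a simplified, hypothetical model. */
    public static class ShippingLeg {
        final String from, to;
        final double price, transitDays;
        ShippingLeg(String from, String to, double price, double transitDays) {
            this.from = from; this.to = to; this.price = price; this.transitDays = transitDays;
        }
        double cost() { return price + 50.0 * transitDays; } // example trade-off between price and time
    }

    /** Uniform-cost search: returns the cheapest total cost from origin to destination. */
    public static double cheapest(Map<String, List<ShippingLeg>> legsByOrigin,
                                  String origin, String destination) {
        Map<String, Double> best = new HashMap<>();
        PriorityQueue<Map.Entry<String, Double>> frontier =
                new PriorityQueue<>(Map.Entry.<String, Double>comparingByValue());
        best.put(origin, 0.0);
        frontier.add(new AbstractMap.SimpleEntry<>(origin, 0.0));

        while (!frontier.isEmpty()) {
            Map.Entry<String, Double> current = frontier.poll();
            String port = current.getKey();
            double costSoFar = current.getValue();
            if (costSoFar > best.getOrDefault(port, Double.MAX_VALUE)) continue; // stale queue entry
            if (port.equals(destination)) return costSoFar;                      // first pop is optimal
            for (ShippingLeg leg : legsByOrigin.getOrDefault(port, Collections.emptyList())) {
                double candidate = costSoFar + leg.cost();
                if (candidate < best.getOrDefault(leg.to, Double.MAX_VALUE)) {
                    best.put(leg.to, candidate);
                    frontier.add(new AbstractMap.SimpleEntry<>(leg.to, candidate));
                }
            }
        }
        return Double.POSITIVE_INFINITY; // destination unreachable
    }
}
```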

As the data set for the freight routes grew to massive scale, we implemented several new approaches for accessing the data fast enough to quickly meet requests for quotes.

Loading the entire dataset into memory on each request is prohibitively slow, especially without locally attached SSD storage. Accessing these gigabytes from Memcache or other caching mechanisms provides good performance for some parts of the functionality, though it is still not fast enough to give a good user experience on full graph traversals.

Ideally, the application would lazily load only what’s needed for a specific request for a freight quote. This requires predictive algorithms to know what to load, since a route might potentially go through any shipping lane worldwide. One quote request might be from Shanghai to New York, and another from London to Sydney, and the optimal path depends on what ships are sailing and available pricing. Databases don’t let you index paths of a graph, but on the basis of massive historical data on access patterns, it’s possible to target the lazy loading at the most commonly accessed paths and so optimise response times for common requests.
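A minimal sketch of that idea, assuming the data can be partitioned by origin–destination “lane”: legs are fetched on first use, kept in memory, and the lanes that historical access patterns show to be most popular are pre-warmed. The lane key format, fetchLegs() and the warm-up list are hypothetical; ShippingLeg is the simplified type from the earlier sketch.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LaneCache {

    // Legs already loaded into memory, keyed by lane, e.g. "CNSHA->USNYC".
    private final ConcurrentMap<String, List<RouteSearch.ShippingLeg>> legsByLane =
            new ConcurrentHashMap<>();

    /** Lazily load the legs that are plausibly relevant to one lane. */
    public List<RouteSearch.ShippingLeg> legsFor(String lane) {
        return legsByLane.computeIfAbsent(lane, this::fetchLegs);
    }

    /** Pre-warm the lanes that historical access patterns show are requested most often. */
    public void warmUp(List<String> topLanesFromHistory) {
        topLanesFromHistory.forEach(this::legsFor);
    }

    private List<RouteSearch.ShippingLeg> fetchLegs(String lane) {
        // In the real application this would query the datastore for candidate
        // legs on this lane; deliberately stubbed out in this sketch.
        throw new UnsupportedOperationException("datastore fetch not shown");
    }
}
```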

Precomputation of top routes can also work once these predictive algorithms are in place, though there are too many combinations to compute all possible freight routes and all possible loads in advance. Ultimately, selective caching and selective precomputation are of little use when the enormous search space is so fragmented and diverse.

One part of the solution is to load all relevant objects into RAM on initialisation. Though it’s a relatively rare architecture, there has been a trend towards holding lots of data in memory as RAM becomes less expensive. Redis, for example, is a popular in-memory database that uses massive RAM for high-speed data access. However, loading all the data takes far longer than App Engine’s 60-second limit, and is also too slow for Google’s approach to scaling, maintenance, and upgrades, which involves the frequent starting and stopping of instances with little warning.

Just when our application was about to give up under the strain, an early version of the App Engine Flexible Environment (back then called Managed VMs) became available. This variant on App Engine removed the restrictions on threads, startup time, Java-class access, and memory size. Our application could now take as long as it needed to load data, and hold up to 200 GB in memory. It still benefitted from the plumbing of App Engine, like Datastore, BigQuery, and logging, and all the existing API calls to the App Engine worked without change.
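In practice that meant we could do the warm-up the obvious way: kick off a long-running background load when the instance starts, something the standard runtime’s 60-second startup limit and short-lived threads had ruled out. A rough sketch, in which loadEverythingIntoMemory() is a hypothetical stand-in for the bulk load:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

public class WarmUpListener implements ServletContextListener {

    /** Flipped to true once the in-memory route data has finished loading. */
    public static final AtomicBoolean READY = new AtomicBoolean(false);

    @Override
    public void contextInitialized(ServletContextEvent event) {
        Thread warmUp = new Thread(() -> {
            loadEverythingIntoMemory(); // may legitimately run for tens of minutes
            READY.set(true);
        }, "route-data-warmup");
        warmUp.setDaemon(true);
        warmUp.start();
    }

    @Override
    public void contextDestroyed(ServletContextEvent event) {
        // nothing to clean up in this sketch
    }

    private void loadEverythingIntoMemory() {
        // Hypothetical: bulk-load the ~200 GB of route data; left as a stub here.
    }
}
```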

Google App Engine Flexible Environment

We have already described the challenges of an application that needed superfast access to 200 GB of routing data to run uniform-cost graph search algorithms. Google App Engine Flexible Environment let us do this by loosening some of the restrictions of the standard App Engine, particularly the limits on memory and thread lifespan. But Flexible Environment, as we will explain, was still not quite flexible enough.

We needed little access to the new resources that Flexible Environment opened up, like the local filesystem and sockets; but we did need more of what we already had, like RAM and thread lifetime. Partly this was because our application was developed within App Engine’s limits, but it was also because App Engine really does provide a good variety of services and provides them well. And if we had decided to access the operating system directly for anything but trivial requirements, we would just have moved to GCE.

Is it infrastructure or is it platform?

Flexible Environment is somewhere between IaaS and PaaS. On the one hand, it is based on GCE, letting you use, for example, the full RAM of a GCE instance, and allowing you to SSH into the server. Just as you can deploy Docker on Google Container Engine, you can swap in your own Docker container on Flexible Environment, including customisations like a different Java VM. (We did that, to let us switch on the G1 garbage collection algorithm, which is more suitable for a big-heap application.)

But Flexible Environment is not really an IaaS. It is better seen as a variant of the legacy App Engine PaaS (now renamed “App Engine Standard Environment”). Flexible Environment has all of App Engine’s APIs and some of its limitations. This means that you cannot fully leverage the power of the IaaS virtual machines (VMs) on which it is implemented. For example, the 200 GB maximum is of little use when instances can be restarted without warning.

Though you can treat Flexible Environment as an IaaS, break out of the sandbox, and work at the OS level, it is rare that you need to do so. If you do, consider moving to GCE.

The Flexible Environment is a bit stiff

We started with Flexible Environment when it was first available in alpha; it’s developed since then, but it’s still in beta and is not optimised for large-scale in-memory data.

Google shuts down the instances weekly for updates to the OS and infrastructure libraries, which causes problems for an application that takes tens of minutes to warm up. Though we arranged to configure this to happen less frequently, in doing so we lost an important advantage of Flexible Environment, the automatic maintenance and scaling. And Google still does shut down instances for maintenance, so that without special efforts, all the replicated instances of a single application can be down at the same time.

Bulk-loading data is not easy. The App Engine API and the Google Datastore implementation assume you’ll query data sequentially by indexes, rather than loading most of a large dataset in parallel.
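What we wanted looks roughly like the sketch below: partition the dataset by some property and stream each partition on its own thread, relying on Flexible Environment’s relaxed thread rules. The “ShippingLeg” kind and “region” property are illustrative, not our actual schema, and this uses the low-level App Engine Datastore API.

```java
// Sketch of one way to parallelise the initial bulk load from Datastore.
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.PreparedQuery;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.FilterOperator;
import com.google.appengine.api.datastore.Query.FilterPredicate;

import java.util.List;
import java.util.Queue;
import java.util.concurrent.*;

public class BulkLoader {

    public Queue<Entity> loadAll(List<String> regions) throws InterruptedException {
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Queue<Entity> loaded = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(regions.size());

        for (String region : regions) {
            pool.submit(() -> {
                Query query = new Query("ShippingLeg")
                        .setFilter(new FilterPredicate("region", FilterOperator.EQUAL, region));
                PreparedQuery prepared = datastore.prepare(query);
                // Large chunk size reduces the number of round trips to Datastore.
                for (Entity leg : prepared.asIterable(FetchOptions.Builder.withChunkSize(500))) {
                    loaded.add(leg);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.MINUTES); // warm-up can legitimately take this long
        return loaded;
    }
}
```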

The load-balancing algorithm is inflexible, detecting only whether an application is healthy or not. That may work for short-lived App Engine instances that are either up or down, but with long-lived instances, load balancers need to direct traffic to the healthiest of the available instances, based on current memory load, processing load, or any other parameter that makes sense for the needs of the application. Google makes these capabilities available in the HTTP Load Balancer, but only in GCE, not in the Flexible Environment.
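One workaround in this category is a hedged sketch rather than anything the platform provides: have the application answer the health check conservatively, reporting itself unhealthy while it is still warming up or when heap headroom is low, so traffic drains towards better-placed instances. The threshold is illustrative, and the readiness flag is reused from the warm-up sketch above.

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HealthCheckServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        Runtime rt = Runtime.getRuntime();
        long headroom = rt.maxMemory() - (rt.totalMemory() - rt.freeMemory());
        boolean healthy = WarmUpListener.READY.get()      // warm-up finished (see earlier sketch)
                && headroom > 2L * 1024 * 1024 * 1024;    // keep at least ~2 GB of heap free

        if (healthy) {
            resp.setStatus(HttpServletResponse.SC_OK);
            resp.getWriter().write("ok");
        } else {
            resp.setStatus(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
        }
    }
}
```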

Deploying Docker containers is doable, but nowhere near as easy as in GKE, the Google Container Engine. As mentioned, we set up a Docker container with a customised version of the Java 7 runtime with the G1 garbage collector. To do this, we had to move from the convenient App Engine-specific SDK to the more generic, and less convenient, command-line Google Cloud SDK. The process is convoluted, requiring multiple build steps, both locally and in the cloud.

Ultimately, an application on Flexible Environment is running on plumbing that you did not design and cannot control, just as with PaaS. Though you can work around those challenges, say, by turning off the weekly upgrades, or using a different database, breaking out of the default assumptions loses some of the benefits of the App Engine Flexible Environment. In that case, GCE is the better fit.

The future platform for in-RAM processing in the cloud 

Many specialist applications run complex algorithms, such as traversing large graphs quickly. Examples include our marketplace for international shipping, as well as travel and mapping applications. These algorithms need a combination of approaches for accessing data: fast querying, caching, and pre-loading. The fastest way to access data is in memory, but scaling up memory is not easy. This raises some special requirements that are not fully met by today’s cloud platforms.

The market is wide open for a vendor who can meet these requirements in a PaaS-like layer:

– Loading massive data into memory quickly. This requires parallelisation and a data layer that can iterate over a dataset fast, in chunks.

– Smart load balancing which respects the complex behaviour of long-lived server instances.

– Minimising the stopping and starting of instances for maintenance and scaling. When restarting an instance is absolutely necessary, all the rest should remain available. Since restart is slow, it is preferable to enable “hot” changes: automated memory and library updates at runtime.

– Implementing garbage collection that works on the assumption of huge, rarely-changing datasets. (The Java VM’s G1 is a step in that direction.)

Why not IaaS?

The best solution today is to build and fine-tune a solution yourself, typically on an IaaS layer like Google Compute Engine (GCE) or AWS Elastic Compute Cloud (EC2), using as big an instance as possible. Google offers up to 420 GB of RAM in a new beta offering, and Amazon up to 2 TB.

If App Engine’s architecture, aimed at a very specific type of web application, is too restrictive, then why not just move to GCE?

In our case, we had already made heavy use of App Engine’s services, such as memcache, logging, task queues, and the email service, and did not want to leave them behind.

But there are more reasons to get as much functionality as possible from a PaaS layer. To use GCE, you have to do a lot more coding. For API integrations with Google services like Datastore, Memcache, and BigQuery, you use the less convenient remote client APIs, which are automatically generated from JSON, rather than the built-in App Engine SDK in native Java. You write scripts to install the Java VM and other plumbing, and to deploy builds. There is a performance overhead too.

Integrating with underlying services is a hassle. For a service as simple as logging, for example, you set up a log analytics environment like the ELK Stack – itself composed of storage, an indexer, a search engine, and graphical tools. You then manage all of these over time, making sure they get just the right amount of RAM, disk, I/O and other resources to meet requirements while keeping down expenses. You do the same for every other component, including messaging queues and the database. You also have to take care that each component is upgraded regularly and remains compatible with all the others.

Staying in Flexible Environment lets you avoid all that. There may ultimately be no choice, but that will depend on whether Flexible Environment matures to fully support the special needs of a memory-intensive application.

Halfway to breaking the boundaries

It’s the old “best of breed” vs “integrated systems” dilemma, and as usual, neither answer is always right. Fast traversal of large datasets has special requirements, and though Flexible Environment handles some of these better than App Engine, it’s not quite there yet. Perhaps as it comes out of beta, it will allow a lifecycle that respects non-trivial initialisation times.

When that happens, Google App Engine Flexible Environment will be the first cloud platform that breaks out of the usual assumptions and enables this broad category of applications whose needs are not yet met by cloud providers. Amazon and Microsoft, listen up.

Read more: Google announces eight new cloud regions and greater customer integration

Parallels Attending SIMO Educación in Spain

As its name suggests, the SIMO Educación Learning Technology Exhibition is all about learning. It is an event that features learning opportunities, lectures, talks, and insights on how to best use the technological solutions available. Tackling a wide range of topics, from hardware to storage to printing and network systems to digitization and virtualization, […]


Volta sees post-Brexit opportunity as data centre expansion revealed


IP EXPO: Never mind Brexit – London-based Volta Data Centres says an opportunity is afoot after launching another floor of its data centre site.

The vendor, which operates off a single site in central London, says enquiries have gone up by around 50% since the referendum result in June. Jonathan Arnold, Volta managing director, explains: “I think those [enquiries] that are linked to Brexit are very much looking at planning for the future because there is clearly still that unknown of what will happen next year.

“There are customers looking at their options, so that [they think] ‘if we put this in central London, we know that’s definitely in the UK’; especially looking at us, because we’re that single site, they know that it can’t be anywhere else. I do think there is an element of planning out there, people working out what to do before they make some decisions next year.”

Getting the expansion, and the customer interest alongside it, is music to Volta’s ears – the company says it had 40 serious leads from last year’s IP EXPO, closing “a few” of them, and is hoping for similar this year – yet the company is also focusing on its European strategy.

In another announcement, Volta is forging a partnership with Luxembourg-based LuxConnect; staying true to its UK base while acknowledging a European opportunity. The deal was announced at a reception held in the Luxembourg Embassy earlier this week, with the two companies sharing carriers in BT, Cogent, Colt, Level 3 and Verizon.

“We’ve been talking to each other for quite a while – pre-Brexit, may I add – so it’s very much [the idea that] we are a single site data centre in London, they’re four sites but they’re all in Luxembourg,” says Arnold.

“It’s kind of natural we’re getting enquiries from customers saying do you know anyone in Europe, they’re getting equal enquiries…we don’t compete, we’re both pure play colocation, we don’t offer managed services which is absolutely key to us, and in the conversation that we had, we have similar cultures between the two businesses so it makes sense to form an alliance and see where that takes us.”

For the new additions to the company’s space in London, Arnold says Volta will be moving customers over in the coming two to three weeks, with a capacity of between 370 and 400 racks depending on the layout. Naturally, the Brexit referendum result and the resulting uptick in enquiries were somewhat unexpected, but Arnold is philosophical. “We’re not sure next year what’s going to happen, although if it carries on from this year we’re definitely seeing quite significant growth for us as a business,” he says.

“We’ve got space and availability for customers to move in very quickly, so we’ll just see what happens.” 

Akamai Extends its Acquisition Spree

Akamai, a leader in content delivery networks (CDN), is on an acquisition spree. On October 4, it acquired a California-based startup called Soha Systems for an undisclosed amount. Founded in 2013 by Haseeb Budhani, Soha Systems provides secure enterprise access for the cloud. It had recently raised just under $10 million in venture funding from Andreessen Horowitz and Menlo Ventures, and was considered a successful tech startup. This was an all-cash deal, though the exact amount was not revealed by either company.

This acquisition comes on the heels of another made last week by Akamai, when it bought a New York-based startup called Concord Systems in another all-cash deal. Earlier, it had also acquired companies such as Bloxx and Prolexic.

These acquisitions are expected to give a big boost to Akamai’s operations as it looks to consolidate its position as the world’s leading content delivery network provider. Currently, it operates one of the largest distributed computing platforms in the world, and the company alone is responsible for 15 to 30 percent of all digital traffic.

Soha Systems, an innovator in cloud security, was a natural partner for Akamai, as the latter looks to provide more secure enterprise applications in the cloud. Over the last few years, Akamai has been looking to make a foray into cloud enterprise security applications, which act as a natural complement to its web traffic services. In fact, providing employees with secure access to enterprise applications is a core component of web access, especially with the enterprise trend of moving more applications to the cloud. In this sense, Soha Systems’ security service is expected to be the perfect product for Akamai’s expansion plans.

There have also been many rumors that Akamai is preparing itself to be a potential acquisition target for Microsoft or Google as they look to take on competition from Amazon’s AWS. Since Soha Systems offers secure access delivered as a service, this acquisition is likely to boost Akamai’s chances of being acquired by one of the big companies in the near future.

Further, this acquisition can help Akamai tap into a new business segment – cloud security. Today, more companies are looking to move their applications to the cloud for better performance, and in such a scenario security becomes absolutely vital. IT teams are grappling with providing security, access, and performance without compromising any one of them for the others. Akamai’s strategy can go a long way towards providing security across a range of different devices, helping its enterprise customers make the most of the key trends driving cloud and mobile services.

This deal is expected to give Soha Systems a boost too, beyond the cash payout for its founders. Soha’s cloud security is likely to be a high-value component of Akamai’s massive global platform, giving it more visibility on a global stage. The existing operations of Soha Systems, along with its employees and clients, will not be affected, according to a press release from Akamai.


Announcing @MangoSpring to Exhibit at @CloudExpo | #IoT #M2M #InfoSec

SYS-CON Events announced today that MangoApps will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
MangoApps provides modern company intranets and team collaboration software, allowing workers to stay connected and productive from anywhere in the world and from any device.


Test, test, then test again: Analysing the latest cloud DR strategies


In September, we hosted a roundtable with fifteen business leaders to discuss and debate the findings from our survey, The State of IT Disaster Recovery Amongst UK Businesses. The debate was chaired by Chris Francis, techUK board member. Customers Wavex and Bluestone also participated in the discussion, as did our partner Zerto and industry influencers Ray Bricknell from Behind Every Cloud and analyst Peter Roe from TechMarketView. The event was lively and thought provoking.

Outages definitely happen more frequently than we think. We ran through the scale of outages that had been reported in the press in just the last month, including at organisations like British Airways, ING Bank and Glasgow City Council. British Airways lost its check-in facility due to a largely unexplained ‘IT glitch’, ING Bank’s regional data centre went offline due to a fire drill gone wrong (reports suggest that more than one million customers were affected by the downtime), and Glasgow City Council lost its email for three days after a fire system blew in the Council’s data centre.

Our survey backed up the high frequency of outages, showing that 95% of companies surveyed had faced an IT outage in the past 12 months. Interestingly, 87% of that 95% who suffered outages considered them severe enough to trigger a failover. We looked at some of the reasons for those outages, and top of the list were system failure and human error. So it is often not the big headlines we see, such as environmental threats, storms or even a terrorist threat, that bring our systems down, but more mundane day-to-day issues. The group also suggested that issues often occur at the application level rather than the entire infrastructure being taken down.

We also discussed the importance of managing expectations and how disaster recovery should be baked in rather than seen as an add-on. Most businesses have a complex environment with legacy systems, so they can’t really expect 100% availability all of the time. That said, the term disaster recovery can scare people, so those around the table felt that we should really talk more about maintaining ‘business as usual’ and resilience. DR isn’t about failing over an entire site anymore; it’s actually about pre-empting issues, for example testing and making sure that everything is going to work before you make changes to a system.

The discussion moved on to the impact of downtime. The survey found that every second really does count. When we asked respondents about the impact of downtime and how catastrophic it was, 42% said even seconds of downtime would have a big impact. This figure rose to nearly 70% when it came to minutes. The group’s advice was that businesses really need to focus on recovery times when looking at a DR solution. We also talked about how much budget is spent on meeting recovery goals. The reality is that you can’t pay enough to compensate for downtime, but for most businesses there will always be some kind of trade-off between budget and downtime.

The group discussed whether business decision makers really understand the financial impact of downtime. Is more education needed about recovery times, what can be recovered, and prioritising different systems so the business understands what will happen when outages take place?

We then moved on to look at overconfidence in DR solutions. The survey found that 58% had issues when failing over, despite 40% being confident that their disaster recovery plans would work. Only 32% executed a failover, were confident, and found that it all worked well; 10% did not fail over but were confident that it would work well. We talked to the group about this misplaced confidence: while IT leaders know the importance of having a DR solution and taking measures to implement one, there appears to be a gap between believing your business is protected in a disaster and having that translate into a successful failover.

The bottom line is that DR strategies are prone to failure unless failover systems are thoroughly and robustly tested. Confidence in failover comes down to how frequently IT teams actually perform testing, and whether they are testing the aspects that are really important, such as at the application level. Equally, are they testing network access, performance, security and so on? We certainly believe that testing needs to be done frequently to build evidence and a proven strategy. If testing only takes place once a year or once every few years, how confident can organisations be?

The group agreed that the complex web of interlocking IT systems is one of the biggest inhibitors to successful testing. While testing may be conducted on one part of a system in isolation, if that part fails it can trigger a chain of events in other systems that the organisation would not be able to control.

The group agreed that there is an intrinsic disconnect between what management wants to hear in terms of DR recovery times and what management wants to spend.

In conclusion, we discussed the need to balance downtime against cost, as no one has an unlimited budget. A lot of the issues raised in the survey are challenges that can be traced directly back to simply not testing enough or not doing enough high-quality testing. The overall advice that iland recommends from the survey is to test, test and test again – and, importantly, to make sure that DR testing can be performed non-intrusively so that production applications are not affected, is cost-effective, and does not place a large administrative burden on IT teams.

Editor’s note: You can download a copy of the survey results here.

Transparent Cloud Computing Consortium to Exhibit at @CloudExpo Silicon Valley | #IoT #Cloud #BigData

SYS-CON Events announced today that Transparent Cloud Computing (T-Cloud) Consortium will exhibit at the 19th International Cloud Expo®, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
The Transparent Cloud Computing Consortium (T-Cloud Consortium) will conduct research activities into changes in the computing model resulting from collaboration between “device” and “cloud”, and into the creation of new value and markets through organic data processing. High-speed, high-quality networks and dramatic improvements in computer processing capabilities have greatly changed the nature of applications and made the storing and processing of data on the network commonplace.
