Category Archives: Database

SAP unveils new powers within Analytics in the Cloud

SAP has unveiled a new user-friendly analytics service for enterprises which it claims will give better insights by offering an ‘unparalleled user experience’.

The SAP Cloud for Analytics will be delivered through a planned software as a service (SaaS) offering that unifies all SAP’s analytical functions into one convenient dashboard.

Built natively on the SAP HANA Cloud platform, it will be a scalable, multi-tenant environment at a price which SAP says is affordable to companies and individuals. The new offering aims to bring together a variety of existing services including business intelligence, planning, budgeting and predictive capacity.

According to SAP, it has fine-tuned workflows so that it’s easier for users to get from insight to action, as a single application spirits the user through this journey more rapidly. It achieves this by giving universal access to all data, digesting it and forwarding the right components to the right parts of the organisation. An intuitive user interface (UI) will help all users, from specialists such as finance professionals to generalists such as line-of-business analysts, to build connected planning models, analyze data and collaborate. It can extend to unstructured data, helping users to spot market trends within social media and correlate them with company inventories, SAP claims.

It’s all about breaking down the divisions between silos and blending the data to make visualization and forecasting possible, said Steve Lucas, president, Platform Solutions, SAP. “SAP Cloud for Analytics will be a new cloud analytics experience. That to me is more than visualization of data, that’s realization of success,” said Lucas.

SAP said it is also working with partners to provide seamless workflows.

SAP and Google are collaborating to extend the levels of analysis available to customers, according to Prabhakar Raghavan, VP of Engineering at Google Apps. “These innovations are planned to allow Google Apps for Work users to embed, refresh and edit SAP Cloud for Analytics content directly in Google Docs and Google Sheets,” said Raghavan.

Amazon Web Services makes aggressive customer acquisition play

At its Amazon re:Invent event, Amazon Web Services (AWS) announced a number of products and initiatives designed to make it easier for potential customers to move their business to the AWS Cloud.

AWS Snowball is a portable storage appliance designed as an alternative to uploading data over the network; Amazon claims it can move 100 TB of data to AWS in less than a week. Amazon is betting that companies are neither willing to prioritise their existing bandwidth, nor devote the time to do this over the network. In addition the company launched Amazon Kinesis Firehose, which is designed to make it easier to load streaming data into the AWS cloud.
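As a rough illustration of the Firehose ingestion model, the sketch below uses the boto3 Python SDK to push a single JSON record into a hypothetical delivery stream; it assumes AWS credentials are configured and that a stream named "sensor-stream" already exists with a destination such as S3.

```python
import json
import boto3

# Assumes AWS credentials are configured and a Firehose delivery stream
# named "sensor-stream" (hypothetical) already exists and points at S3.
firehose = boto3.client("firehose", region_name="us-east-1")

record = {"device_id": "thermostat-42", "temperature_c": 21.5}

# Each record is delivered to the stream's configured destination
# (S3, Redshift, etc.) without managing any ingestion servers.
firehose.put_record(
    DeliveryStreamName="sensor-stream",
    Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
)
```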

“It has never been easier or more cost-effective for companies to collect, store, analyze, and share data than it is today with the AWS Cloud,” said Bill Vass, VP of AWS Storage Services. “As customers have realized that their data contains key insights that can lead to competitive advantage, they’re looking to get as much data into AWS as quickly as possible. AWS Snowball and Amazon Kinesis Firehose give customers two more important tools to get their data into AWS.”

On top of these new products Amazon announced two new database services – AWS Database Migration Service and Amazon RDS for MariaDB – designed to make it easier for enterprises to bring their production databases to AWS, which seems to take aim at Oracle customers especially.
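To give a sense of how lightweight provisioning a managed MariaDB instance is compared with standing up a database in-house, here is a minimal boto3 sketch; the instance identifier, class and credentials are placeholders, not values from the announcement.

```python
import boto3

# Hypothetical identifiers and credentials; assumes AWS credentials
# are already configured for the target account.
rds = boto3.client("rds", region_name="us-east-1")

rds.create_db_instance(
    DBInstanceIdentifier="inventory-db",   # placeholder name
    Engine="mariadb",                      # the newly added RDS engine
    DBInstanceClass="db.t2.small",
    AllocatedStorage=20,                   # in GiB
    MasterUsername="admin",
    MasterUserPassword="change-me-please",
)
```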

“With more than a hundred thousand active customers, and six database engines from which to choose, Amazon RDS has become the new normal for running relational databases in the cloud,” said Hal Berenson, VP of Relational Database Services, AWS. “With the AWS Database Migration Service, and its associated Schema Conversion Tool, customers can choose either to move the same database engine from on-premises to AWS, or change from one of the proprietary engines they’re running on-premises to one of the several open source engines available in Amazon RDS.”

Continuing the theme of taking on the big enterprise IT incumbents Amazon launched QuickSight, a cloud business intelligence service that would appear to compete directly with the likes of IBM, while aiming to undercut them with a low-price as-a-service model.

“After several years of development, we’re excited to bring Amazon QuickSight to our customers – a fast and easy-to-use BI service that addresses these needs at an affordable price,” said Raju Gulabani, VP of Database Services at AWS. “At the heart of Amazon QuickSight is the brand new SPICE in-memory calculation engine, which uses the power of the AWS Cloud to make queries run lightning fast on large datasets. We’re looking forward to our customers and partners being able to SPICE up their analytics.”

Lastly Amazon announced a new business group in partnership with Accenture that is also designed to make it easier for companies to move their business to the cloud. The Accenture AWS Business Group is a joint effort between the two and is another example of Accenture putting the cloud at the centre of its strategy.

“Accenture is already a market leader in cloud and the formation of the Accenture AWS Business Group is a key part of our Accenture Cloud First agenda,” said Omar Abbosh, Chief Strategy Officer of Accenture. “Cloud is increasingly becoming a starting point with our clients for their enterprise solutions. Whether our clients need to innovate faster, create new services, or maximize value from their investments, the Accenture AWS Business Group will help them get there faster, with lower risk and with solutions optimized for AWS.”

MapR claims JSON IoT development breakthrough

Enterprise software vendor MapR has unveiled plans to slash the workload of IoT developers and administrators by cutting the complexity of managing its NoSQL databases.

The key to this simplification, it says, lies in more creative use of the JavaScript Object Notation (JSON) format, which it claims has the potential to deliver significant improvements in both database scalability and the analysis of the information databases contain.

“We’re seeing big changes in the way applications are developed and how data is consumed,” said MapR’s chief marketing officer Jack Norris. “The underlying data format is the key to making information sharing easier.”

Bringing out the advantages of JSON makes administration easier, according to Norris, because users can easily make changes in a database built on documents. This in turn helps developers when they are planning applications, because it is easier to create a user-friendly system. Tweaking JSON will benefit system builders in their own work too, Norris argued, since a document database can now be given enterprise-grade scalability, reliability and integrated analytics.
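The flexibility Norris describes is easiest to see with a small example. This is not MapR-DB or OJAI code, just plain Python and JSON, but it shows why a document model absorbs schema changes that would force a table alteration in a rigid relational design.

```python
import json

# Two customer documents in the same collection; the second has gained a
# loyalty_tier field with no schema migration and no downtime.
docs = [
    {"_id": "c1001", "name": "A. Tanaka", "orders": [{"sku": "X1", "qty": 2}]},
    {"_id": "c1002", "name": "B. Okafor", "orders": [], "loyalty_tier": "gold"},
]

for doc in docs:
    # Application code simply tolerates the optional field.
    tier = doc.get("loyalty_tier", "standard")
    print(f"{doc['name']}: {tier} tier, {len(doc['orders'])} order(s)")

print(json.dumps(docs[1], indent=2))
```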

The organisational improvements include the ability to personalise and deliver better online shopping experiences, reduce risk and prevent fraud in real-time, improve manufacturing efficiencies and cut costs. Savings will come from preventing cluster sprawl, eliminating data silos and lowering the cost of ownership of data management, claims MapR. Meanwhile it has promised a productivity dividend from continuous analytics of real-time data.

MapR-DB supports the Open JSON Application Interface (OJAI), which is designed to be a general purpose JSON access layer across databases, file systems and message streams, enabling a flexible and unified interface to work with big data, claims MapR.

The addition of document database capabilities extends the NoSQL MapR-DB to cover more types of unstructured business data, said one analyst. This could make it faster and easier to build big data applications, without the burden of shuffling data around first.

“MapR continues to build on the innovative data platform at the core of its Hadoop distribution,” said Nik Rouda, senior analyst at the Enterprise Strategy Group.

Microsoft selects Ubuntu for first Linux-based Azure offering

Microsoft has announced plans to simplify Big Data and widen its use through Azure.

In a blog post, T K Rengarajan, Microsoft’s corporate VP for Data Platforms, described how the expanded Microsoft Azure Data Lake Store, available in preview later this year, will provide a single repository that captures data of any size, type and speed without forcing changes to applications as data scales. In the store, data can be securely shared for collaboration and is accessible for processing and analytics from HDFS applications and tools.

Another new addition is Azure Data Lake Analytics, a dynamically scaling service built on Apache YARN, which Microsoft says will spare users from being sidetracked from their work by having to understand distributed architecture. This service, available in preview later this year, will include U-SQL, a language that unifies the benefits of SQL with the expressive power of user code. U-SQL’s scalable distributed querying is intended to help users analyse data in the store and across SQL Servers in Azure, Azure SQL Database and Azure SQL Data Warehouse.
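U-SQL itself is Microsoft’s language and is not shown here, but the general idea of blending declarative SQL with user-written code can be sketched in a few lines of ordinary Python using SQLite’s user-defined functions; treat this as an analogy for the programming model rather than anything Azure-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (username TEXT, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [("alice", "click:home"), ("bob", "click:pricing")])

def page_of(payload: str) -> str:
    """User-defined logic applied row by row from inside the SQL query."""
    return payload.split(":", 1)[1]

# Register the Python function so SQL can call it by name.
conn.create_function("page_of", 1, page_of)

for row in conn.execute("SELECT username, page_of(payload) FROM events"):
    print(row)
```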

Meanwhile, Microsoft has selected Ubuntu for its first Linux-based Azure offering. The Hadoop-based big data service, HDInsight, will run on Canonical’s open source operating system Ubuntu.

Azure HDInsight uses a range of open source analytics engines including Hive, Spark, HBase and Storm. Microsoft says it is now on general release with a 99.9 per cent uptime service level agreement.

Meanwhile Azure Data Lake Tools for Visual Studio will provide an integrated development environment that aims to ‘dramatically’ simplify authoring, debugging and optimization for processing and analytics at any scale, according to Rengarajan. “Leading Hadoop applications that span security, governance, data preparation and analytics can be easily deployed from the Azure Marketplace on top of Azure Data Lake,” said Rengarajan.

Azure Data Lake removes the complexities of ingesting and storing all of your data while making it faster to get up and running with batch, streaming, and interactive analytics, said Rengarajan.

Semantic technology: is it the next big thing or just another buzzword?

Most buzzwords circulating right now describe very attention-grabbing products: virtual reality headsets, smart watches, internet-connected toasters. Big Data is the prime example of this: many firms are marketing themselves to be associated with this term and its technologies while it’s ‘of the moment’, but are they really innovating or simply adding some marketing hype to their existing technology? Just how ‘big’ is their Big Data?

On the surface of it, one would expect semantic technology to face similar problems; however, the underlying technology requires a much more subtle approach. The technology is at its best when it’s transparent, built into a set of tools to analyse, categorise and retrieve content and data before it’s even displayed to the end user. While this means it may not experience as much short-term media buzz, it is profoundly changing the way we use the internet and interact with content and data.

This is much bigger than Big Data. But what is semantic technology? Broadly speaking, semantic technologies encode meaning into content and data to enable a computer system to possess human-like understanding and reasoning. There are a number of different approaches to semantic technology, but for the purposes of this article we’ll focus on ‘Linked Data’. In general terms this means creating links between data points within documents and other forms of data containers, rather than between the documents themselves. It is in many ways similar to what Tim Berners-Lee did in creating the standards by which we link documents, just on a more granular scale.

Existing text analysis techniques can identify entities within documents. For example, in the sentence “Haruhiko Kuroda, governor of Bank of Japan, announced 0.1 percent growth,” ‘Haruhiko Kuroda’ and ‘Bank of Japan’ are both entities, and they are ‘tagged’ as such using specialised markup language. These tags are simply a way of highlighting that the text has some significance; it remains with the human user to understand what the tags mean.

 

[Figure 1: tagging]

Once tagged, entities can then be recognised and have information from various sources associated with them. Groundbreaking? Not really. It’s easy to tag content such that the system knows that “Haruhiko Kuroda” is a type of ‘person’, but this still requires human input.
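A hand-rolled sketch of what such tagging amounts to in data terms (not Ontotext’s actual markup or tooling) might look like this in Python: each tag records the span of the entity and the type a human or model has assigned to it.

```python
# Hypothetical tagging output: character offsets plus an assigned type.
sentence = "Haruhiko Kuroda, governor of Bank of Japan, announced 0.1 percent growth"

tags = [
    {"text": "Haruhiko Kuroda", "type": "Person", "start": 0, "end": 15},
    {"text": "Bank of Japan", "type": "Organisation", "start": 29, "end": 42},
]

for tag in tags:
    # Sanity-check that each tag really covers the span it claims to.
    assert sentence[tag["start"]:tag["end"]] == tag["text"]
    print(f"{tag['text']!r} tagged as {tag['type']}")
```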

[Figure 2: named entity recognition]

Where semantics gets more interesting is in the representation and analysis of the relationships between these entities. Using the same example, the system is able to create a formal, machine-readable relationship between Haruhiko Kuroda, his role as the governor, and the Bank of Japan.

[Figure 3: relation extraction]

For this to happen, the pre-existing environment must be defined. For the system to understand that ‘governor’ is a ‘job’ which exists within the entity of ‘Bank of Japan’, a rule must exist which states this as an abstraction. This is called an ontology.

Think of an ontology as the rule-book: it describes the world in which the source material exists. If semantic technology was used in the context of pharmaceuticals, the ontology would be full of information about classifications of diseases, disorders, body systems and their relationships to each other. If the same technology was used in the context of the football World Cup, the ontology would contain information about footballers, managers, teams and the relationships between those entities.
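One common way to encode both the ontology and the extracted facts is as subject-predicate-object triples. The sketch below uses the open source rdflib library with a made-up example.org vocabulary; it is illustrative only and is not the stack Ontotext ships.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/")  # hypothetical vocabulary

g = Graph()

# Ontology (the "rule-book"): a Governor is a kind of Job.
g.add((EX.Governor, RDFS.subClassOf, EX.Job))
g.add((EX.BankOfJapan, RDF.type, EX.Bank))

# Facts extracted from the sentence.
g.add((EX.HaruhikoKuroda, RDF.type, EX.Person))
g.add((EX.HaruhikoKuroda, EX.holdsRole, EX.Governor))
g.add((EX.Governor, EX.withinOrganisation, EX.BankOfJapan))

# Query the graph: who holds a role, and inside which organisation?
query = """
SELECT ?person ?org WHERE {
    ?person <http://example.org/holdsRole> ?role .
    ?role <http://example.org/withinOrganisation> ?org .
}
"""
for person, org in g.query(query):
    print(person, "works within", org)
```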

What happens when we put this all together? We can begin to infer relationships between entities in a system that have not been directly linked by human action.

[Figure 4: inference]

An example: a visitor arrives on the website of a newspaper and would like information about bank governors in Asia. Semantic technology allows the website to return a much more sophisticated set of results from the initial search query. Because the system has an understanding of the relationships defining bank governors generally (via the ontology), it is able to leverage the entire database of published text content in a more sophisticated way, capturing relationships that would have been overlooked by computer analysis alone. The result is that the user is provided with content more closely aligned to what they are already reading.

Read the sentence and answer the question: “What is a ‘Haruhiko Kuroda’?” To a human the answer is obvious. He is several things: human, male, and a governor of the Bank of Japan. This is the type of analytical thought process, this ability to assign traits to entities and then use these traits to infer relationships between new entities, that has so far eluded computer systems. The technology allows the inference of relationships that are not specifically stated within the source material: because the system knows that Haruhiko Kuroda is governor of Bank of Japan, it is able to infer that he works with other employees of the Bank of Japan, that he lives in Tokyo, which is in Japan, which is a set of islands in the Pacific.
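The inference step can be imitated with a toy forward-chaining loop over triples: starting from facts that were stated explicitly, rules keep deriving new facts until nothing more can be added. The rules and entity names below are invented for illustration and are not a production reasoner.

```python
# Explicitly stated facts (subject, predicate, object).
facts = {
    ("HaruhikoKuroda", "governorOf", "BankOfJapan"),
    ("BankOfJapan", "locatedIn", "Tokyo"),
    ("Tokyo", "locatedIn", "Japan"),
}

def infer(triples):
    """Apply simple rules repeatedly until no new facts are derived."""
    derived = set(triples)
    while True:
        new = set()
        for s, p, o in derived:
            if p == "governorOf":               # a governor works for the bank
                new.add((s, "worksFor", o))
            if p in ("worksFor", "locatedIn"):  # follow the location chain
                for s2, p2, o2 in derived:
                    if s2 == o and p2 == "locatedIn":
                        new.add((s, "locatedIn" if p == "locatedIn" else "basedIn", o2))
        if new <= derived:
            return derived
        derived |= new

# Print only the facts that were never written down explicitly.
for triple in sorted(infer(facts) - facts):
    print(triple)
```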

Companies such as the BBC, which Ontotext has worked with, are sitting on more text data than they have ever experienced before. This is hardly unique to the publishing industry, either. According to Eric Schmidt, former Google CEO and executive chairman of Alphabet, every two days we create as much information as was generated from the dawn of civilisation up until 2003 – and he said that in 2010. Five years later and businesses of all sizes are waking up to this fact – they must invest in the infrastructure to fully take advantage of their own data. You may not be aware of it, but you are already using semantic technology every day. Take Google search as an example: when you input a search term, for example ‘Bulgaria’, two columns appear. On the left are the actual search results, and on the right are semantic search results: information about the country’s flag, capital, currency and other information that is pulled from various sources based on semantic inference.

Written by Jarred McGinnis, UK managing consultant at Ontotext

Salesforce would be more effective if it was more mobile, workers tell survey

Customer relationship management (CRM) leader Salesforce needs to improve the employee experience before its clients can get the most out of it, says a new report.

The advice comes in the fourth annual State of Salesforce report, from consultancy Bluewolf, a partner agency to the world’s top CRM vendor. It suggests that while customers of companies that use Salesforce feel more connected, the users of the CRM system aren’t as happy. The main complaints are inconsistent data quality and a lack of mobile options. However, the majority of the survey sample plan to ramp up their investment in the system.

Based on the feedback from 1,500 Salesforce customers worldwide, the 2015-2016 report suggests that the concerns of employees should be the next priority for Salesforce as it seeks to fine tune its CRM software.

The demand for better mobility was made by 77 per cent of salespeople surveyed. Their most time-consuming task was identified as ‘opportunity management’ which, the report concludes, could be improved by better mobile applications. The study also says that employees were twice as likely to believe that Salesforce makes their job easier if it could be accessed from a mobile device.

Bluewolf’s report suggests that Salesforce’s priorities in 2016 should be to invest more in three areas: the mobile workforce, predictive analytics and improving the sales team’s experience of using apps.

Amid the modern obsession with customer experience, it is easily forgotten that employees create customer success, according to Bluewolf CEO Eric Berridge. “While innovation is essential to improving employee experiences, companies must combine it with data, design and an employee culture.”

However, the report does indicate that companies are happy with Salesforce, since 64 per cent plan to increase their budget. Half, 49 per cent, have at least two Salesforce clouds and 22 per cent have at least three. A significant minority, 11 per cent, say they are planning to spend at least half as much again next year on Salesforce services.
That investment is planned because 59 per cent of Salesforce users say the CRM system is much simpler to use than it was a year ago.

Meanwhile, many companies are taking the employee matter into their own hands, says the report. One in three companies has already invested in agent productivity apps and one in five is planning to invest.

Google, Microsoft punt big data integration services into GA

Big cloud incumbents are doubling down on data integration

Google and Microsoft have both announced the general release of Cloud Dataflow and Azure Data Factory, their respective cloud-based data integration services.

Google’s Cloud Dataflow is designed to integrate separate databases and data systems – both streaming and batch – in one programming model while giving apps full access to, and the ability to customise, that data; it is essentially a way to reduce operational overhead when doing big data analysis in the cloud.
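Dataflow’s SDK at launch was Java-based, and its programming model later became Apache Beam, which also offers a Python SDK. As a rough sketch of the “one model for batch and streaming” idea, the toy Beam pipeline below counts error lines from an in-memory source; swapping in a streaming source would leave the transforms unchanged. Names and sample data are invented.

```python
import apache_beam as beam
from apache_beam.transforms import combiners

# A toy batch pipeline; the same chain of transforms could consume a
# streaming source (e.g. a message queue) instead of an in-memory list.
with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Read" >> beam.Create(["error db timeout", "ok", "error disk full"])
        | "OnlyErrors" >> beam.Filter(lambda line: line.startswith("error"))
        | "Count" >> combiners.Count.Globally()
        | "Print" >> beam.Map(print)
    )
```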

Microsoft’s Azure Data Factory is a slightly different offering. It’s a data integration and automation service that regulates the data pipelines connecting a range of databases and data systems with applications. The pipelines can be scheduled to ingest, prep, transform, analyse and publish that data – with ADF automating and orchestrating the more complex transactions.

ADF is actually one of the core components of Microsoft’s Cortana analytics offering, and is deployed to automate the movement and transformation of data from disparate sources.

The maturation and commoditisation of data integration and automation is a positive sign for an industry that has for a very long while leaned heavily on expensive bespoke data integration. As more cloud incumbents bring their own integration offerings to the table it will be interesting to see how some of the bigger players in data integration and automation, like Informatica or Teradata, respond.

IBM buys Compose to strengthen database as a service

IBM has acquired Compose, a DBaaS specialist

IBM has acquired Compose, a database as a service provider specialising in NoSQL and NewSQL technologies.

Compose helps set up and manage databases running at pretty much any scale, deployed on all-SSD storage. The company’s platform supports most of the newer database technologies including MongoDB, Redis, Elasticsearch, RethinkDB and PostgreSQL, and is deployed on AWS, DigitalOcean and SoftLayer.
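From a developer’s point of view the appeal is that a hosted database is just a connection string. The snippet below, with hypothetical URIs and credentials, uses the standard pymongo and redis-py clients; provisioning, failover and storage stay on the provider’s side.

```python
import redis
from pymongo import MongoClient

# Hypothetical connection strings of the kind a DBaaS dashboard hands you.
mongo = MongoClient("mongodb://appuser:secret@db.example.com:27017/app")
mongo.app.events.insert_one({"type": "signup", "plan": "trial"})

cache = redis.Redis.from_url("redis://:secret@cache.example.com:6379/0")
cache.incr("signups:trial")
```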

“Compose’s breadth of database offerings will expand IBM’s Bluemix platform for the many app developers seeking production-ready databases built on open source,” said Derek Schoettle, general manager, IBM Cloud Data Services.

“Compose furthers IBM’s commitment to ensuring developers have access to the right tools for the job by offering the broadest set of DBaaS service and the flexibility of hybrid cloud deployment,” Schoettle said.

Kurt Mackey, co-founder and chief executive of Compose, said: “By joining IBM, we will have an opportunity to accelerate the development of our database platform and offer even more services and support to developer teams. As developers, we know how hard it can be to manage databases at scale, which is exactly why we built Compose – to take that burden off of our customers and allow them to get back to the engineering they love.”

IBM said the move would give a big boost to its cloud data services division, where it’s seeing some solid traction; this week the company said its cloud data services, one of its big ‘strategic imperatives’, saw revenues swell 30 per cent year on year. And according to a report cited by the IT incumbent and produced by MarketsandMarkets, the cloud-based data services market is expected to swell from $1.07bn in 2014 to $14bn by 2019.

This is the latest in a series of database-centric acquisitions for IBM in recent years. In February last year the company acquired database as a service specialist Cloudant, which built a distributed, fault tolerant data layer on top of Apache CouchDB and offered it as a service largely focused on mobile and web app-generated data. Before that it also bought Daeja Image Systems, a UK-based company that provides rapid search capability for large image files spread over multiple databases.

Will datacentre economics paralyse the Internet of Things?

The way data and datacentres are managed may need to change drastically in the IoT era

The statistics predicting what the Internet of Things (IoT) will look like and when it will take shape vary widely. Whether you believe there will be 25 billion or 50 billion Internet-enabled devices by 2050, there will certainly be far more devices than there are today. Forrester has predicted that 82% of companies will be using IoT applications by 2017. But unless CIOs pay close attention to the economics of the datacentre, they will struggle to be successful. The sheer volume of data we expect to manage across these IoT infrastructures could paralyse companies and their investments in technology.

The Value of Information is Relative

ABI Research has calculated that there will be 16 Zettabytes of data by 2020. Consider this next to another industry estimate of 44 Zettabytes by 2020, while others have said that humanity produced only 2.7 Zettabytes up to 2013. Bottom line: the exponential growth in data is huge.

The natural first instinct for any datacentre manager or CIO is to consider where he or she will put that data. Depending on the industry sector, there are regulatory and legal requirements which mean companies will have to be able to collect, process and analyse runaway amounts of data. Another estimate suggests that by 2019 this will mean processing 2 Zettabytes a month!

One way to react is to simply buy more hardware. From a database perspective the traditional approach would be to create more clusters in order to manage such huge stores of data. However, a critical element of IoT is that it’s based on low-cost technology, and although the individual pieces of data have a value, there is a limit to that value. For example, you do not need to be told every hour by your talking fridge that you need more milk or be informed by your smart heating system what the temperature is at home.  While IoT will lead to smart devices everywhere, its value is relative to the actionable insight it offers.

A key element of the cost-benefit equation that needs more consideration is the impact of investment requirements at the backend of an IoT data infrastructure. As the IoT creates a world of smart devices distributed across networks, CIOs have to decide whether collection, storage and analytics happen locally, near the device, or are driven to a centralised management system. There could be some logic to keeping the intelligence local, depending on the application, because it could speed up the process of providing actionable insight. The company could use low-cost, commoditised devices to collect information, but it will still become prohibitively expensive if the company has to buy vast numbers of costly database licenses to ensure the system performs efficiently – never mind the cost of integrating data from such a distributed architecture.
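One pattern that follows from this is summarising at the edge and only shipping what is actionable. The Python sketch below (device names and thresholds are invented) reduces a batch of raw readings to a single small document before anything touches the central database.

```python
from statistics import mean

# Raw readings collected locally by a hypothetical device over an hour.
readings = [21.4, 21.5, 21.5, 21.6, 27.9, 21.5]
ANOMALY_THRESHOLD_C = 25.0  # only unusual values justify central storage

summary = {
    "device_id": "thermostat-42",
    "avg_temp_c": round(mean(readings), 2),
    "anomalies": [r for r in readings if r > ANOMALY_THRESHOLD_C],
}

# One compact summary goes upstream instead of every raw reading.
print(summary)
```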

As a result, the Internet of Things represents a great opportunity for open source software, thanks to the cost effectiveness of open source versus traditional database solutions. Today, open source-based databases have the functionality, scalability and reliability to cope with the explosion in data that comes with the IoT while transforming the economics of the datacentre. It is a point which Gartner’s recent Open Source Database Management report endorsed when it said: “Open source RDBMSs have matured and today can be considered by information leaders, DBAs and application development management as a standard infrastructure choice for a large majority of new enterprise applications.”

The Cost of Integrating Structured and Unstructured Data

There are other key considerations when calculating the economic impact of the IoT on the datacentre. The world of IoT will be made up of a wide variety of data, structured and unstructured. Already, the need for working with unstructured data has given rise to NoSQL-only niche solutions. The deployment of these types of databases, spurred on by the rise of Internet-based applications and their popularity with developers, is proliferating because they offer the affordability of open source. Yet their use is leading to operational and integration headaches, as data silos spring up all around the IT infrastructure due to limitations in these NoSQL-only solutions. In some cases, such as where ACID properties are required and robust DBA tools are available, it may be more efficient to use a relational database with NoSQL capabilities built in and get the best of both worlds rather than create yet another data silo. In other cases, such as for very high-velocity data streams, keeping the data in these newer data stores and integrating them may be optimal.
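As one example of “a relational database with NoSQL capabilities built in”, Postgres’s JSONB column type stores schemaless documents inside an ACID table. The sketch below uses the psycopg2 driver with invented table names and connection details; it is an illustration of the pattern, not a recommendation from the report.

```python
import psycopg2

# Connection details are placeholders; assumes a reachable Postgres instance.
with psycopg2.connect("dbname=iot user=postgres") as conn:
    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE IF NOT EXISTS device_events (
                id      serial PRIMARY KEY,
                payload jsonb NOT NULL
            )
        """)
        cur.execute(
            "INSERT INTO device_events (payload) VALUES (%s::jsonb)",
            ('{"device": "fridge-7", "milk_level": "low"}',),
        )
        # Query inside the document with JSONB operators, under ACID semantics.
        cur.execute(
            "SELECT payload->>'device' FROM device_events "
            "WHERE payload->>'milk_level' = %s",
            ("low",),
        )
        print(cur.fetchall())
```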

A key priority for every CIO is integrating information as economically as possible so organizations can create a complete picture of their business and their customers. The Postgres community has been at the forefront of addressing this challenge with the creation of Foreign Data Wrappers (FDWs), which can integrate data from disparate sources like MongoDB, Hadoop and MySQL. FDWs link external data stores to Postgres databases, so users can access and manipulate data from foreign sources as if it were part of the native Postgres tables. Such simple, inexpensive solutions for connecting the new data streams emerging along with the Internet of Everything will be critical to unlocking value from data.
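Here is a minimal sketch of the FDW pattern, using the built-in postgres_fdw via psycopg2 (hostnames and credentials are placeholders). Wrappers for MongoDB, Hadoop or MySQL follow the same CREATE SERVER / CREATE FOREIGN TABLE shape, differing only in their wrapper-specific OPTIONS.

```python
import psycopg2

# DDL for exposing a table on a remote Postgres server as if it were local.
ddl = """
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER sensors_srv FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', dbname 'sensors', port '5432');
CREATE USER MAPPING FOR CURRENT_USER SERVER sensors_srv
    OPTIONS (user 'reader', password 'secret');
CREATE FOREIGN TABLE remote_readings (device_id text, temp_c numeric)
    SERVER sensors_srv
    OPTIONS (schema_name 'public', table_name 'readings');
"""

with psycopg2.connect("dbname=analytics user=postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(ddl)
        # The foreign table now behaves like a native one in queries and joins.
        cur.execute(
            "SELECT device_id, avg(temp_c) FROM remote_readings GROUP BY device_id"
        )
        print(cur.fetchall())
```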

The Internet of Things promises a great transformation in the ability of enterprises to holistically understand their business and customer environment in real time and deliver superior customer engagement.  It is critical, though, that CIOs understand the economic impact on their datacentre investments.  The IoT creates a number of new challenges, which can be addressed using the right technology strategy.

Written by Pierre Fricke, vice president of product, EnterpriseDB

Data-as-a-service specialist Delphix scores $75m in latest round

Delphix secured $75m in its latest funding round this week

Data-as-a-service specialist Delphix announced this week that the company has concluded a $75m funding round that will be used by the company to bolster its cloud and security capabilities.

The funding round, led by Fidelity Management and Research Company, brings the total amount secured by the company to just over $119m since its 2008 founding.

Delphix offers what is increasingly referred to as data-as-a-service, though a more accurate description of what it does is data compression and replication as a service: the ability to virtualise, secure, optimise and move large databases – whether from an application like an ERP or a data warehouse – from on-premise to the cloud and back again.

It offers broad support for most database technologies including Oracle, Oracle RAC, Oracle Exadata, Microsoft SQL Server, IBM DB2, SAP ASE, PostgreSQL, and a range of other SQL and NewSQL technologies.

The company said the additional funding will be used to expand its marketing activities and “aggressively invest” in cloud, analytics and data security technologies in a bid to expand its service capabilities.

“Applications have become a highly contested battleground for businesses across all industries,” said Jedidiah Yueh, Delphix founder and chief executive.

“Data as a Service helps our customers complete application releases and cloud migrations in half the time, by making data fast, light, and unbreakable—a huge competitive advantage,” he said.