Posts

Showing posts from December, 2018

The Link Between Big Data and Fraud Prevention in the Financial Sector

Even though digital technologies have done a lot to improve our overall financial situation, and to provide people with better, more convenient access to a number of different services and resources, the sad reality of today is that technological advancements also tend to bring about increased levels of fraud. It’s not hard to see why, either – honest people aren’t the only ones who benefit from such developments, and in fact, those who have the willingness to try breaking the system can often find themselves in a much better position to take advantage of its flaws. However, things aren’t that bad in general terms. Technological progress has also made it possible – and quite easy in some cases – to deal with fraud and to increase the level of confidence that customers have in the systems they are working with. We still have some way to go until we’re at a point where things can run truly smoothly, but at least we’re moving in the right direction. We have Big Data to thank for that

Apache Hive Warehouse Connector Use-Cases

1. Motivation The HiveWarehouseConnector (HWC) is an open-source library which provides new interoperability capabilities between Hive and Spark. In practice, Hive and Spark are often leveraged together by companies to provide a scalable infrastructure for data warehousing and data analytics. However, as they both continue to expand their capabilities, interoperability between the two becomes difficult. […] The post Apache Hive Warehouse Connector Use-Cases appeared first on Hortonworks .

What Does 2019 Hold For Big Data Analytics?

In this day and age, information is everywhere. We are buried under it and bombarded with it constantly. However, information is also power: it gives you an edge in whatever kind of business you are in. This is why big data analytics exists. You process and analyze big sets of data, sift through what is and is not important, and then take what you can use. Big data analytics has been a staple, in one form or another, since the very concept of business began. However, it has changed and evolved, moving in the same rhythm as technology. And now that the internet has become so ubiquitous and ever-present, it is a cornerstone for anyone who wants to see success. But, taking all the above into account, you always need to be on your toes. The world constantly changes, the technological landscape morphs by the second, and 2019 will be no different. Read on below if you want to see what we believe the New Year will hold for this concept. Machine learning – Yes or No?

How Big Data is Making the Scientific Method Difficult to Replicate

There has been a growing concern among intellectuals in many scientific fields, from academic researchers to pharmaceutical scientists, about the lack of practical application of their published test results towards solving real-world problems. Although they are given enough funding to operate laboratories with well-calibrated analytical instruments and high-tech equipment, these scholars still struggle to produce valid data from their in-house projects. This crisis has severe consequences for scientists who follow the Scientific Method, which is still the foundation of research and development efforts. They begin by forming a theory that could be tested under specific circumstances, as in the variables that will be altered to see if the data changes. One case study from Bayer Healthcare involved a thorough review of 67 famous projects completed by the scientific community. To their dismay, only around 25% of the studies could be repeated in a different environment outside the lab fa

5 Cloud Trends That Will Dominate 2019

The nonstop rise of the cloud has been incredibly impressive thus far, but things are only just getting started for this nascent technology that’s already upended everything we know about business and personal storage. As the new year rapidly approaches, analysts and investors across the marketplace are erupting into speculation about which trends will continue to disrupt the cloud market as we know it today. Here are five cloud trends that will dominate 2019, and how this technology will continue to evolve as the rest of the 21st century gets underway. 1. The cloud skill gap will narrow There’s currently a huge dearth of cloud expertise in the market, largely because so many companies need IT-savvy cloud experts, but the labor market is only so deep when it comes to competent talent. This has resulted in the so-called “cloud skill gap,” which has frustrated economic growth for years and continues to be a thorn in the side of businesses of all shapes and sizes. Luckily, 2019 will

Google to Open AI Lab in Princeton and Collaborate with University Researchers

Two Princeton University computer science professors will lead a new Google AI lab opening in January in the town of Princeton. The lab is expected to expand New Jersey’s burgeoning innovation ecosystem by building a collaborative effort to advance research in artificial intelligence. The lab, at 1 Palmer Square, will start with a small number of faculty members, graduate and undergraduate student researchers, recent graduates and software engineers. The lab builds on several years of close collaboration between Google and professors Elad Hazan and Yoram Singer, who will split their time working for Google and Princeton. The work in the lab will focus on a discipline within artificial intelligence known as  machine learning , in which computers learn from existing information and develop the ability to draw conclusions and make decisions in new situations that were not in the original data. Examples include speech recognition systems that transcribe a wide spectrum of voices, and se

The Most In-Demand Jobs of 2019 for Business and Technology

Pursuing an advanced education can help you create a satisfying and stable career. However, there are so many options and fields to choose from that it can feel a little overwhelming to make a decision about what kind of job you’d like to pursue. If you’ve been thinking about your next career move, then you’ve probably considered getting into the business or technology fields at least once. These industries have a lot of opportunity for growth, tend to offer excellent salaries, and are in high demand. It can be hard to find a career you enjoy that also offers you financial and job security, but it’s well worth the effort in the end. The good news? There’s lots of information out there about what kinds of skills employers will be looking for in the next few years. Need some possible career ideas to consider? Here are the most in-demand jobs that will be hot in business and technology in 2019. Data Scientist If you know anything about trends in business and technology, then you k

Query Federation with Apache Hive

Organizations commonly use a plethora of data storage and processing systems today. These different systems offer cost-effective performance for their respective use cases. Besides traditional RDBMSs such as Oracle DB, Teradata, or PostgreSQL, many organizations may use Apache Kafka for streams and events data, Apache Druid for real-time series data, and Apache Phoenix for quick […] The post Query Federation with Apache Hive appeared first on Hortonworks .

{Submarine} : Running deep learning workloads on Apache Hadoop

(This blog post is coauthored by Xun Liu and Quan Zhou from Netease.) Introduction Hadoop is the most popular open source framework for the distributed processing of large, enterprise data sets. It is heavily used in both on-prem and on-cloud environments. Deep learning is useful for enterprise tasks in the fields of speech recognition, image classification, […] The post {Submarine} : Running deep learning workloads on Apache Hadoop appeared first on Hortonworks .

2018 in Review: The Year of Transition is Over

Last year, I called 2018 the Year of Transition because technological developments were accelerating but had not yet reached full maturity. Indeed, artificial intelligence did leap forward, with Uber developing a new machine learning technique and AI beating humans in the game Dota 2. As I correctly predicted, the ICO hype came to a near standstill, due in part to the crypto bear market and additional scrutiny from regulators such as the SEC. We saw a new approach to data ownership with the launch of Solid PODs by Sir Tim Berners-Lee. Finally, we saw that quantum computing is developing a lot faster than many had expected. With the year almost over now, and this being my last article of 2018, let’s look back at some of my most popular articles of the past year: a quick recap in case you missed something in the past 12 months, to prepare you for the Year of Truth that is upon us. Blockchain Of course, I wrote extensively about blockchain this year as a lot has happene

Announcing the Release of the First HDP 3 Sandbox!

We are excited to announce the release of the first Hortonworks Data Platform (HDP) 3 Sandbox. The Hortonworks sandbox is a great way to test drive some of the latest features found in HDP 3. The sandbox, a single node environment, is packed with 100% open-source Apache Projects that will allow you to explore Big […] The post Announcing the Release of the First HDP 3 Sandbox! appeared first on Hortonworks .

Introducing Hive-Kafka integration for real-time Kafka SQL queries

Our last few blogs in the Kafka Analytics blog series focused on the addition of Kafka Streams to HDP and HDF and on how to build, secure, and monitor Kafka Streams apps / microservices. In this blog, we focus on the SQL access pattern for Kafka with the new Kafka Hive Integration work. Kafka SQL […] The post Introducing Hive-Kafka integration for real-time Kafka SQL queries appeared first on Hortonworks .

3 Technologies That Will Change the Construction Industry in 2019

Many technologies in development or being tested will likely forever change the industries they affect. Here are three examples of innovations that seem set to disrupt the construction sector in 2019 and beyond. 1. Improved Safety Equipment Construction workers know they have dangerous careers, and it's necessary to use protective equipment such as hard hats and safety harnesses depending on their tasks. Despite those precautions, statistics from the Occupational Safety and Health Administration collected in 2016 found 21.1 percent of worker fatalities in the private sector happened in the construction industry. But, safety-related wearables will make worksites safer and potentially reduce the associated fatalities. According to one survey, 82 percent of contractors using wearables in 2017 reported site safety improvements. That's often because those gadgets collect real-time information and send it to supervisors, allowing them to intervene when necessary. A clip-on

Spark Streaming: Understanding StreamingContext

Spark Streaming wasn't the first streaming architecture. Over time, multiple technologies have been developed in order to address various real-time processing needs. One of the first popular stream processor technologies was Twitter Storm, and it was used in many businesses. Spark includes the streaming library, which has grown to become the most widely used technology today. This is mainly because Spark Streaming holds some significant advantages over all of the other technologies, the most important being its integration of Spark Streaming APIs within its core API. Not only that, but Spark Streaming is also integrated with Spark ML and Spark SQL, along with GraphX. Because of all of these integrations, Spark is a powerful and versatile streaming technology. This tutorial has been taken from Big Data Analytics with Hadoop 3 written by Sridhar Alla and published by Packt. Note that you can find more information here on Spark Streaming Flink, Heron (Twitter Storm's succe

Monitoring Kafka Streams Microservices with Hortonworks Streams Messaging Manager (SMM)

In last week’s blog Secure and Governed Microservices with HDF/HDP Kafka Streams Support, we walked through how to build microservices with the new Kafka Streams support in HDF 3.3 and HDP 3.1 that is fully integrated with Ranger, Schema Registry and other platform services. This blog is all about monitoring these microservices with Hortonworks Streams Messaging […] The post Monitoring Kafka Streams Microservices with Hortonworks Streams Messaging Manager (SMM) appeared first on Hortonworks .

Big Data Processing Engines – Which one do I use?: Part 1

Special thanks to Bill Preachuk and Brandon Wilson for reviewing and providing their expertise Introduction Columnar storage is an often-discussed topic in the big data processing and storage world today – there are hundreds of formats, structures, and optimizations into which you can store your data and even more ways to retrieve it depending on […] The post Big Data Processing Engines – Which one do I use?: Part 1 appeared first on Hortonworks .

How Big Data is the Future of Business Security

Over time, instances of hacks seem to be breeding at a higher rate. Research in 2016 showed that over a million incidents of cyber crimes were reported in that year alone. Over time, malware attacks have become more challenging to fight and detect and are more sophisticated than before. One of the challenges that modern businesses are facing is to keep their network and systems protected from hackers and malware attacks. With these never-ending threats, even the most successful companies can't sustain their growth and performance. Big Data Some business owners believe that big data is a threat to their enterprises while others view it as a savior. It enables entrepreneurs to store large volumes of data, observe, analyze, and detect any irregularity within a system or network. That's what has made it a preferred choice in keeping cyber crimes at bay. The volumes of information available in big data reduce the time that an entrepreneur can take to detect and resolve a threat,

AI Index 2018: The AI Boom is Worldwide and Accelerating

The rate of progress in the field of artificial intelligence is one of the most hotly contested aspects of the ongoing boom in teaching computers and robots how to see the world, make sense of it, and eventually perform complex tasks both in the physical realm and the virtual one. And just how fast the industry is moving, and to what end, is typically measured not just by actual product advancements and research milestones, but also by the prognostications and voiced concerns of AI leaders, futurists, academics, economists, and policymakers. AI is going to change the world, but how and when are still open questions. Findings from a group of experts were published last week in the second annual AI Index, assembled by experts from Harvard, MIT, Stanford, the nonprofit OpenAI, and the Partnership on AI industry consortium, among others. The goal is to measure the field’s progress using hard data and to try and make sense of that progress as it relates to thorny subjects like workplace au

We Need to Stop Overhyping Deep Learning

AI today is described in breathless terms as computer algorithms that use silicon incarnations of our organic brains to learn and reason about the world, intelligent superhumans rapidly making their creators obsolete. The reality could not be further from the truth. As deep learning moves from the lab into production use in mission critical fields from medicine to driverless cars, we must recognize its very real limitations as nothing more than a pile of software code and statistics, rather than the learning and thinking intelligences we describe them as. Every day data scientists build machine learning algorithms to make sense of the world and harness large piles of data into marketable insights. As guided machine assistance tools, they operate much like the large classical observation equipment of the traditional sciences, software microscopes and telescopes onto society. However, a physicist does not proclaim that their analysis software is alive and thinking its own thoughts abou

Machine Learning in Medical Imaging and Analysis

Artificial Intelligence and IoT Trends to Watch Out For In 2019

Do you know where you fall on the IoT adoption curve? Well, it doesn’t matter as long as you are learning about the different technologies complementing IoT and its integration into existing systems. The following post highlights Artificial Intelligence and IoT trends to watch out for in 2019. Artificial Intelligence and the Internet of Things (IoT) have already fallen into the endless pit of buzzword-vagueness. These technologies evolve with each passing day. After consulting dozens of emerging technology executives and researchers and spending hours combing the insights of primary market research firms, I have put together this post, which may help explain why using a myriad of connected devices is crucial to surviving in the upcoming years. Let me show you a glimpse of how combining AI with the Internet of Things can work wonders for any business. 1. Automated vacuum cleaners: iRobot Roomba iRobot, one of the most successful automated vacuum released in the year 2

Open Hybrid Architecture: Running Stateful Containers on YARN

The Why In the previous blog, we talked about the Open Hybrid Architecture. This architecture decouples storage and computation, thus computation tasks need to access various types of storage systems. This requires the ability to mount an external storage volume onto a container so that the container can read/write data just like on a local file […] The post Open Hybrid Architecture: Running Stateful Containers on YARN appeared first on Hortonworks .

2x Faster BI Interactive queries with HDP 3.0

Hortonworks announced the general availability of HDP 3.0 this year. You may read more about it here. Bundled with HDP 3.0, Apache Hive 3 with LLAP took a significant leap as an Enterprise Ready Real time Database Warehouse with transactional capabilities that continues to serve BI workloads with lower latencies. HDP 3.0 comes with exciting […] The post 2x Faster BI Interactive queries with HDP 3.0 appeared first on Hortonworks .

Will the Blockchain Deliver a Decisive Blow to Financial Fraud or Fall Flat on its Face?

Technologists have been promoting blockchain technology as a sort of panacea that could save the financial sector and protect everyday banking customers from fraud. Economists and journalists have painted a different picture. They've claimed that distributed ledgers are an easy way of obscuring illegal transactions and point to several high-profile cases where criminals used cryptocurrencies to cover smuggling and gambling debts. Considering that some engineers have gone so far as to call blockchain technology nothing more than a passing fad, the truth may very well lie between these two extremes. There are, however, many organizations already deploying this technology to reduce the risk of fraudulent purchases. Systems Currently in Place to Prevent Fraud Since blockchain decentralizes the data structures used to govern economic transactions, it can define the concept of trust. Institutionalized trust providers like third-party banking authorities have long defined what a leg

Improving MySQL by Replicating to the In-Memory Database Tarantool

Replicating MySQL is one of the in-memory-database Tarantool’s killer functions. It allows you to keep your existing MySQL database while at the same time accelerating it and scaling it out horizontally. Even if you aren’t interested in extensive expansion, simply replacing existing replicas with Tarantool can save you money, because Tarantool is more efficient per core than MySQL. To read a testimonial of a company that implemented Tarantool replication on a large scale, please see here, as well as here. I wanted to point out at the outset that if you run into any trouble with regards to the basics of Tarantool, you may wish to consult the first two tutorials in the Tarantool 101 series, which can be found here and here. And please note that these instructions are for CentOS 7.5 and MySQL 5.7. They also assume that you have systemd installed and are working with an existing MySQL installation. Finally, a helpful log for troubleshooting during this tutorial is replicatord.log in /v

McAfee CTO On Election Hacking, Cryptojacking, Quantum Security

Election hacking. Information warfare. Adversarial artificial intelligence. All worrisome topics racing through Steve Grobman’s head these days. But the McAfee chief technology officer seems surprisingly upbeat about the prospects of meeting these cybersecurity challenges—or at least putting up a good fight. I met Grobman at a coffee shop in downtown Boston last week. He was visiting from Texas to give a talk at the  AI World Conference and Expo . Grobman previously spent more than two decades working for Intel in California and held key cybersecurity positions there, including his current role as technology chief for McAfee while it was still part of Intel. (Intel acquired McAfee in 2010 for $7.7 billion, then spun the company out last year in a $4.2 billion deal that reportedly gave investment firm TPG 51 percent ownership and Intel a 49 percent stake.) As CTO of one of the world’s oldest and largest standalone cybersecurity companies, I was curious to pick Grobman’s brain about t

The State of Natural Language Processing – Giant Prospects, Great Challenges

Artificial Intelligence Could Help Predict Volcanic Eruptions

By Paul Voosen, covering Earth and Planetary Science, Science. Satellites are providing torrents of data about the world’s active volcanoes, but researchers have struggled to turn them into a global prediction of volcanic risks. That may soon change with newly developed algorithms that can automatically tease signals of volcanic risk from that data, raising the prospect that within a couple of years scientists could develop a global volcano warning system. Without such tools, geoscientists simply can’t keep up with the information pouring out of the satellites, says Michael Poland, the scientist-in-charge of the U.S. Geological Survey’s Yellowstone Volcano Observatory in Vancouver, Washington, who was not involved in either study. “The volume of data is overwhelming,” he says. Andrew Hooper, a volcanologist at the University of Leeds in the United Kingdom who led the development of one method, says the new algorithms should benefit the roughly 800 million people who live near volcanoes. “Abou

3 Reasons Not to Use Blockchain Within Your Organisation

For the past few years, blockchain has been a huge buzzword. Especially after the crypto hype of 2017, organisations were convinced that they had to do something with blockchain, even if it was only changing their company name to include blockchain in it. Although we are in a crypto bear market, blockchain remains a buzzword. It reminds me of 5 or 6 years ago, when big data was the buzzword of the day. Back then, every organisation thought they had to do something with big data, but they had no idea what. There was even a famous slide stating: “Big Data is like teenage sex; everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…” Fast-forward five years and big data is no longer a buzzword; it has become a prerequisite for understanding the environment and remaining competitive. Big data analytics can be used in every part of your organisation. Blockchain is different. Blockchain has the potential to fun

How to Correctly Analyze Your Data to Ensure Company Efficiency

The science of data research holds more weight over your company's effectiveness than one might generally expect. In fact, comprehensive sets of data and analytic information directly measure the performance of a business, and, therefore, future success depends heavily on a business's analytic metrics. Today, data is measured differently than it formerly had been within the world of business analytics. Everything has gone digital. No longer are accountants and bookkeepers necessary to determine how successful a company is and in what areas it might need to make changes. A company needs only to invest in the right analytics and market research software. Knowing how to identify the right data, however, is a far cry from knowing the right way to analyze that data in order to cultivate successful and effective business practices. The following are a few tips and tricks on how to correctly analyze your data and market research to ensure a company's efficiency. Understanding

How Big Data Can Unseat Big Players in the Stock Market

Have you heard of Fibonacci trading? It’s an investment strategy based on the Fibonacci sequence. A stock trader who favors the Fibonacci ratios — a high volatility or low volatility Fibonacci trader —  will sell or hold their position based on the ratios. The interesting thing about this strategy is the way in which it mirrors nature, which is an anomaly. For whatever reason, nature decided to organize structures according to the pattern the Fibonacci sequence describes. In turn, traders can base their strategy on a mathematical anomaly that corresponds with nature. Here’s the thing: before day traders could take advantage of advanced automated trading software, a trader who tried to manually employ the Fibonacci ratios was at the mercy of their own emotions. At times when a Fibonacci-based strategy is working, manual day traders can fall prey to either the gambler’s fallacy or the hot-hand fallacy. They can decide it’s time to change strategies because of the basic logical errors t

How AI has Helped The Finance Field Overcome Fraud

How would you like the ability to accurately predict financial crimes and prevent them before losing a penny? No, this isn't the plot of a sci-fi thriller or a superpower we can grant you. Artificial intelligence (AI) is changing the finance industry in ways that couldn't be imagined even a few years ago. We haven't quite attained Asimovian levels of prescience yet, but we're moving closer every day. Understanding How AI Works Artificial intelligence has not quite reached complete independence from human input; our machines are still only as smart as their programming. There are four types of AI, which can generally be divided into two categories. One is narrowly defined to be intuitive, but only within the context of its environment and purpose. The other type is the stuff from which sci-fi fantasies are made. It's able to learn, evolve, and make decisions over a range of applications completely independent of human input or interference. This type of AI has t

How Serverless Will Facilitate the Growth of Big Data Applications?

When it comes to designing a big data framework within an organization, serverless computing is emerging as a perfect solution. Setting up a dedicated server to meet your custom requirements is history now. To deal with complex analytics workloads, organizations are often re-architecting their IT infrastructure. Moreover, the popular buzzwords “pay as you go” and “pay for what you use” are proving to be a major driving factor in accelerating the adoption of serverless architecture. With this sudden rise in adoption, serverless is becoming mainstream as more and more people opt for it. Serverless is definitely proving to be a big benefit for organizations across the world. However, very few are leveraging it to enable their big data solutions. Let’s analyze whether serverless is right for powering big data solutions or not. If yes, what are the major benefits it brings to the table? Understanding Serverless Serverless is a popular term which hou

Data Science & Engineering Platform: Data Lineage and Provenance for Apache Spark

This is the third in a series of data engineering blogs that we plan to publish. The first blog outlined the data science and data engineering capabilities of Hortonworks Data Platform. Motivation Apache Spark is becoming the de-facto processing framework for all kinds of complex processing including ETL, LOB business data processing and machine learning. […] The post Data Science & Engineering Platform: Data Lineage and Provenance for Apache Spark appeared first on Hortonworks .

What’s so great about Apache Ambari 2.7?

With Apache Ambari, our mission is to create and foster a 100% open source operations platform that allows teams to quickly deploy, secure, monitor and manage HDP, HDF, and our Hortonworks partner ecosystem products.  Whether you’re a customer with 5 nodes or 5,000, Apache Ambari gives you the enterprise feature set and tools needed to […] The post What’s so great about Apache Ambari 2.7? appeared first on Hortonworks .

Strategic-AI Visionary Metaphors: Knight Rider KITT and AI Self-Driving Cars

By Lance Eliot, the AI Trends Insider For leaders overseeing any substantive AI initiative, it is crucial to establish a strategic vision of what you are ultimately trying to achieve. The strategic vision should lay out the nature of the AI system that you are embarking upon creating and has to be relatively clear cut so that anyone involved will readily grasp your aims. If you fail to identify a strategic vision for the AI effort, the odds are that few will comprehend what you are seeking to build and field. Without a collective understanding among your AI developers, you can end up with something that goes astray of your intention. Worse still, if the overall direction and purpose are muddled or not defined at all, you could wind up with an untoward result, having wasted precious resources and time that otherwise might have led to better success. A strategic vision for your AI project can consist of a narrative that spells out in some detail the goals and objectives, plus it is han