EDF 2015

The AXLE Project exhibited at the European Data Forum 2015 in Luxembourg on the 16th and 17th November 2015. The conference was an opportunity to meet researchers, industry and community parties interested in the data economy and the challenges we face with Big Data.

EDF stand

Evert Jan Hoijtink (Portavita) and Simon Riggs (2ndQuadrant Ltd), the driving forces behind the two SMEs involved in the AXLE project, “manned the stand”, promoting the project’s results.

We were also invited to partake in the poster session, with a presentation entitled “Big Data, Can it be tamed?”

Although this was our last formal dissemination event for the project, which concluded on the 31st October 2015, the partners will continue to share the results of the project at academic and industry conferences in the future.

Final General Assembly – Barcelona, 1st – 2nd October 2015

The AXLE Project partners visited Barcelona for the third time for our last face-to-face project meeting. BSC proved to be excellent hosts during our final visit, and we took the opportunity to review our progress and discuss evaluation: the focus of the last 6 months has been to test the enhancements we have made to hardware and software during the lifetime of the project.

Lan and Anze, University of Ljubljana, produced a great image of how they see the AXLE project partners collaborating – this has been redrawn from their flipchart sketch, but hopefully gets the idea across just as well…

AXLE: The UL Perspective

The story in a nutshell… Portavita produces extremely large datasets through their synthetic dataset generator (anonymised and de-identified) and dumps them into the PostgreSQL database. 2ndQuadrant order this data through the new features they have written for the PostgreSQL code (the majority of which are available in the latest release, 9.5) to greatly enhance security, scalability and performance. At the same time, Barcelona Supercomputing Center and the University of Manchester have experimented with and modified hardware to improve query speeds. And the University of Ljubljana updated the data mining and visualization tool, Orange, to a more user-friendly version (Orange 3 was released in early October 2015) to produce immediate representations of sampled data.

Big Data Analytics: Tablesample, Orange, 2UDA

Umair Shahid, PostgreSQL and Business Intelligence expert, has been instrumental in the development of 2UDA – a data analytics application suite that unifies databases, spreadsheets, data mining and data visualization in one seamless desktop installer – exploiting the collaboration between PostgreSQL and Orange 3. Read the blog here: http://blog.2ndquadrant.com/big-data-analytics-tablesample-orange-2uda/
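
The TABLESAMPLE clause named in the title above was added in PostgreSQL 9.5. Its BERNOULLI method keeps each row independently with a given percentage probability, which is what makes quick visualisation of a large table feasible. The sketch below only illustrates that idea in Python. It is not the PostgreSQL implementation, it does not show how Orange or 2UDA query the database, and the function name `bernoulli_sample` and the table name `measurements` are hypothetical.

```python
import random

# Equivalent idea in PostgreSQL 9.5 SQL (table name is hypothetical):
#   SELECT * FROM measurements TABLESAMPLE BERNOULLI (1) REPEATABLE (42);


def bernoulli_sample(rows, percent, seed=None):
    """Keep each row independently with probability percent / 100.

    `seed` plays the role of REPEATABLE: the same seed gives the same
    sample from the same input. It will not reproduce the rows that
    PostgreSQL itself would pick for that seed.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)
    # rng.random() is in [0, 1), so percent=100 keeps every row
    # and percent=0 keeps none.
    return [row for row in rows if rng.random() * 100 < percent]


if __name__ == "__main__":
    rows = range(1_000_000)
    sample = bernoulli_sample(rows, percent=1, seed=42)
    # Roughly 1% of the rows; the exact count varies with the seed.
    print(len(sample))
```

Because each row is tested independently, the sample size is only approximately the requested percentage, which is usually acceptable for exploratory plots.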

ARMOR – a solution to prevent Row-Hammer data corruption in DRAMs

PhD student Mohsen Ghasempour has devised a solution to a problem that has been exercising industry specialists over the last couple of years.

Mohsen’s research (supervised by Dr Mikel Lujan and Dr Jim Garside), “ARMOR: A Run-time Memory hot-row detector to prevent Row-Hammer data corruption in DRAMs”, is now the subject of a patent application filed on behalf of the University by UMIP, the University’s commercialisation arm. UMIP is also seeking partners to develop the technology into a full commercial offering.

What is ARMOR?
ARMOR is a hardware solution to prevent Row Hammer errors in DRAMs, designed and developed in the School of Computer Science. Row hammering can occur when a specific DRAM wordline (the Row-Aggressor) is activated repeatedly within a refresh interval. In this situation the neighboring cells leak charge at a faster rate than expected. The retention time of such cells thus becomes shorter than the refresh interval (e.g. 64 ms), which means that these cells (the Row-Victims) may lose their data (charge) before the refresh happens. The refresh process then reads the corrupted data and writes it back to the DRAM cells. The main challenge in mitigating the Row-Hammer effect is monitoring the number of activations for each row in the DRAM, which imposes a significant storage overhead on the memory system. ARMOR monitors the activation stream at the memory interface level and detects at run-time which specific rows (i.e. hot rows) are at risk of being “hammered”. ARMOR is capable of detecting all the possible hot rows in a system with a minimal storage overhead.
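
To make the monitoring challenge concrete, here is a minimal Python sketch of the naive approach: one activation counter per row, cleared every refresh interval. This is not ARMOR's design, which is a hardware detector whose internals are not described here. The threshold, the function names and the simplification that all rows are refreshed at the same instant are assumptions made purely for illustration.

```python
from collections import Counter

REFRESH_INTERVAL_MS = 64.0  # refresh interval quoted in the text
HOT_ROW_THRESHOLD = 4       # hypothetical value, tiny so the demo is readable


def detect_hot_rows(activations, threshold=HOT_ROW_THRESHOLD,
                    interval_ms=REFRESH_INTERVAL_MS):
    """Return {row: peak count} for rows activated >= threshold times
    within a single refresh interval.

    `activations` is an iterable of (timestamp_ms, row) pairs in time order.
    Counters are cleared at each interval boundary, on the simplifying
    assumption that a refresh restores the charge of every victim row.
    """
    hot = {}
    counts = Counter()
    window = 0
    for timestamp_ms, row in activations:
        current = int(timestamp_ms // interval_ms)
        if current != window:
            counts.clear()
            window = current
        counts[row] += 1
        if counts[row] >= threshold:
            hot[row] = max(hot.get(row, 0), counts[row])
    return hot


def naive_counter_storage_bytes(num_rows, counter_bits):
    """Storage needed if every row has its own activation counter."""
    return num_rows * counter_bits // 8


if __name__ == "__main__":
    stream = [(0.0, 7), (1.0, 7), (2.0, 3), (3.0, 7), (4.0, 7), (70.0, 7)]
    print(detect_hot_rows(stream))                 # {7: 4}
    print(naive_counter_storage_bytes(65536, 16))  # 131072
```

The second function shows why this approach scales poorly: the counter storage grows linearly with the number of rows, which is the overhead the paragraph above says a practical detector has to avoid.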

Why is ARMOR a Promising Solution?
• capable of detecting all the possible Row Hammer errors with a high level of confidence
• provides precise information about the hammered rows (addresses) and the number of activations with a high level of accuracy (e.g. 99.99%)
• does not need to know the logical-to-physical mapping of DRAMs in order to mitigate Row Hammer errors (ARMOR Cache Solution)
• scalable according to the size of memory
• technology independent and can easily support future device technologies.

For more information see: http://apt.cs.manchester.ac.uk/projects/ARMOR/RowHammer/index.html


Data Processing on FPGAs

by Geoffrey Ndu, University of Manchester

FPGAs (Field-Programmable Gate Arrays) are reprogrammable silicon chips that can be used to implement custom execution engines. FPGAs are becoming mainstream. For instance, the IBM POWER8 processor has the Coherent Accelerator Processor Interface (CAPI), which allows an FPGA accelerator to coherently attach to the fabric of a POWER8 chip and access up to 1 TB of system memory [1]. Microsoft is using FPGAs to accelerate workloads in the datacenter [2].

In AXLE, we are looking into how the potential power of FPGAs could be harnessed by everyone, especially small and medium firms without the resources of the technology giants, to speed up analytics on relational databases. We have identified programming FPGAs with software-like languages, instead of traditional hardware description languages, as a route for database kernel developers to access the power of FPGAs. Programming FPGAs using high-level software languages allows a designer to work more productively at a higher level of abstraction and achieve faster time-to-market.

We have explored the trade-offs involved in using high-level languages to program FPGAs in the context of relational databases. The results of our study are detailed in this paper [3]. We are developing CHO [4], [5], a benchmark suite that could be used to evaluate FPGAs that support programming with OpenCL. OpenCL is an open, royalty-free, parallel programming framework for writing applications that execute across heterogeneous platforms consisting of Central Processing Units (CPUs), Graphics Processing Units (GPUs), FPGAs and other processors. This paper [4] provides more information about our benchmark suite as well as a gentle introduction to computing with FPGAs.

[1]. OpenPOWER CAPI Developer Kit for POWER8

[2]. A. Putnam, A. Caulfield, et al. Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services. In Proceedings of the 41st International Symposium on Computer Architecture (ISCA 2014).

[3]. O. Abella, G. Ndu, et al. An Empirical Evaluation of High-level Synthesis Languages and Tools for Database Acceleration. In Proceedings of the 24th International Conference on Field Programmable Logic and Applications (FPL 2014).

[4]. G. Ndu, J. Navaridas and M. Lujan. CHO: A Benchmark Suite for OpenCL-based FPGA Accelerators. University of Manchester Technical Report UNIMAN-COMP-APT-TR-02-05-2014.

[5]. G. Ndu, J. Navaridas and M. Lujan. CHO: Towards a Benchmark Suite for OpenCL FPGA Accelerators. In Proceedings of the 3rd International Workshop on OpenCL.

Alpha Release of 2ndQuadrant Unified Data Analytics (2UDA)

2ndQuadrant is pleased to announce the alpha release of a new client platform for business intelligence – 2UDA (pronounced ‘tudor’).

2UDA is a data analytics application suite that unifies databases, spreadsheets, data mining and data visualisation in one seamless desktop installer.

Available to download now, 2UDA is packaged and ready to install on a choice of operating systems (Linux, OS X and Windows) at http://2ndquadrant.com/2uda

Although 2UDA is primarily aimed at Data Analysts and Data Scientists, we encourage trial participation from developers and DBAs at this stage of product development, to help us improve the functionality and user experience for all users.

More details on 2UDA at http://2ndquadrant.com/2uda