2010 | Qbase Tech

Oct 20, 2010

A little insight into analytics

Posted by QbaseTechTeam at 10:22 AM Labels: analytics, data mining 0 comments

Qbase provides a wide range of analytical services. In today's post, we'll take a look at one of the standards of analytics: predictive analytics. Predictive analytics falls into two main categories: statistical and data mining. Statistical approaches emerged from statistical theory, whereas data mining emerged from computational theory. Both are rooted in mathematics and share many of the same techniques. However, statistical approaches work to uncover the relationships and correlations among the variables in the predictive models, while data mining (also known as “machine learning”) focuses on the predictive capability of the model without regard to explanation. Data mining techniques do not require the user to know the relationships or patterns in the data, nor do they necessarily lead to simple explanations of the relationships. Today’s data mining tools and applications incorporate many of the algorithms traditionally associated with a statistical approach. The tools leave the choice between explanation and pure prediction, appropriately, to the analyst.

Sep 16, 2010

Working with trauma data

Posted by QbaseTechTeam at 3:02 PM Labels: healthcare 0 comments

The Greater Dayton Area Hospital Association, or GDAHA, is working with Qbase to aggregate its members’ trauma data online for uncomplicated and immediate analysis. The system is securely hosted on the web for access from anywhere, and it is password protected and role-based. The accompanying image (left; click to see a larger version) provides an example of the analytics capabilities Qbase built for GDAHA. The chart demonstrates output from a much larger analytical engine that provides insights into performance and standard of care issues. The chart shows that, for Healthcare Provider “A,” there has been an increase in the distribution of trauma events among 20 – 30 year olds, as well as those over 81 years old, during the analysis period. In addition, the chart shows that, during the same analysis period, there has been a decrease in the distribution of trauma events among those under one year old and those between the ages of 71 – 80. If you're interested in learning more about what Qbase is doing in the healthcare space, visit our healthcare pages on the Qbase website.

Aug 28, 2010

You need a plan

Posted by QbaseTechTeam at 2:05 PM Labels: data protection; IT services 0 comments

For most companies, this isn't news, but a thorough backup and recovery plan is a necessary component of your overall business continuity plan. Why? Well, disasters are never scheduled, always unpredictable and, for the most part, inevitable. There are many causes of data loss, but the most common are hardware failures, natural disasters, human disasters, viruses, and software issues. So, to decrease risk and enhance data protection, you should decide what steps to take before a disaster occurs and data has been lost. What's the best backup and recovery plan for your business? To figure that out, identify likely causes by first performing a risk assessment. We can help you with that (you saw that coming, didn't you?). IT services. 24/7 Help desk. Virtual IT departments. Onsite hardware and data hosting. It's what we do. Just ask some of our customers, including the US Air Force, Konecranes, and Springfield Cancer Center.

Jul 13, 2010

The real threat of viruses and spyware

Posted by QbaseTechTeam at 8:43 AM Labels: data encryption, data protection, IT services 0 comments

Viruses and spyware can bring your laptop to a grinding halt. Think that’s the extent of it? Think again. Most spyware is designed to record your surfing habits by sampling files and history on your machine and then sending the results to a remote location for analysis. Some virus infections can corrupt and destroy your data. Another breed of virus steals your data and spies on your daily activities. It allows a hacker to log everything you type, see your computer screen, watch your web camera, browse and copy your files, and more! Computerworld’s Jeremy Kirk explains that, “A 10-month cyber espionage investigation has found that 1,295 computers in 103 countries belonging to international institutions have been spied on” (read the full article here). Kirk references a 53-page report published March 29, 2009, titled “Tracking GhostNet: Investigating a Cyber Espionage Network.” This report describes the GhostNet network that “primarily uses a malicious software program called ghostRAT (Remote Access Tool) to steal sensitive documents, control Web cams and completely control infected computers.” GhostNet represents just one example of the millions of cyber threats that exist today. As the risks increase, it’s crucial for you to implement small business and enterprise solutions, including antivirus and anti-spyware tools, to provide solid protection for your business.

Jun 9, 2010

Human performance studies

Posted by QbaseTechTeam at 10:11 AM Labels: healthcare; human effectiveness; data analysis 0 comments

At Qbase, our primary areas of focus are data improvement and IT support. However, our secondary areas of focus are just as important: data analysis and software tools development. Healthcare and human performance studies are two important areas where Qbase has contributed significant insight through both purpose-built software and data analysis. For example, to help improve performance and reduce the possibility of injury to our warfighters, Qbase is working in partnership with other organizations (including the Kettering Medical Center Network (KMCN) and the University of Dayton Research Institute (UDRI)) to assist the US Air Force Research Lab (AFRL). Together, the team is using a combination of advanced data analysis tools, imaging technology and nano-based bio-markers to study sleep deprivation and performance. This may eventually have broad application across a number of public and private areas, such as commercial piloting, the trucking industry, and driver safety. Through this Imaging Tools for Human Performance Enhancement and Diagnostics initiative and partnership, Qbase is helping to build tools that will benefit the medical software diagnostics community in the future. More on this partnership and the results of this study, which is still ongoing, in future posts.

May 17, 2010

A bit about "HTC"

Posted by QbaseTechTeam at 12:39 PM Labels: HTC, large data sets 0 comments

For those of you working with HTC — or High Throughput Computing — we thought we'd share a bit about how we do things at Qbase in the HTC arena. First of all, you might be asking, what is High Throughput Computing or, at least, how is it used in relation to processing data? Well, at Qbase, we've designed our HTC system as a solution to increase the processing of large data sets by leveraging cluster-aware, simultaneously-executing applications that handle petabytes of data for immediate analysis and decision making. In other words, no matter how large the data set, we can handle it (and at record speeds). Now, we don't mean to brag....

Apr 30, 2010

Our touch table environment

Posted by QbaseTechTeam at 1:48 AM Labels: GIS, GPS, sensors, touchtable 0 comments

In today's world, where terrorism and natural disasters are an all-too-common reality, situational awareness has become, both locally and nationally, of paramount importance. So, for the past few years Qbase has been developing a number of technologies that together form an advanced GIS-based, sensor fusion environment known as TouchStone®, purpose-built for proactive situational awareness. This environment incorporates as its user interface a 3D GIS system implemented in NASA’s World Wind. The interface provides integration of multiple layers of aerial imagery, ground-based surveillance imagery (fixed and mobile), integration of external GIS feeds, and tracking of GPS-enabled devices. The interface also supports the reading and writing of synthetic GIS layers. Everything in the system is both geographically and temporally synchronized and is presented through a video recorder-style interface that can rewind, pause, and fast-forward through the data and jump back to the live feeds on demand. TouchStone® provides unprecedented access to real-time data from multiple sources for decision-making and for crisis monitoring and prevention in any type of crisis situation. This is an ongoing project, and we are looking for partners to take TouchStone® to the next level. Right now, TouchStone® is being evaluated by the DHS as well as police departments around the U.S. Contact us for more information if your organization has technologies that would integrate well with TouchStone®, or if you have a need for the superior technologies found only in the TouchStone® system.

Mar 31, 2010

An incremental approach to data conditioning works best

Posted by QbaseTechTeam at 9:39 PM Labels: data cleansing, data conditioning 0 comments

The Qbase approach to, and benefits of, data conditioning. Click to view slightly larger image.

It's one of the truths about data that data can't take care of itself, especially when some or all of it has been collected in legacy environments and/or with little or no control over how it was collected. Taking care of your data, what's known as data conditioning, is valuable regardless of whether you're planning system modernization, data migration, or any other large, data-related project. While data problems can be huge and can seem insurmountable, you will have success with data conditioning if you address these problems in small, manageable pieces. This incremental approach, which Qbase uses and has had much success with, lets you expose critical information about the data in the most effective way. Thus, informed decisions can be made concerning risks: Do we move ahead with the data in its current state? Or do we first condition the data for improved results? We recently added another tool to assist in conditioning your data over time. Qbase Data Drift Analysis™ allows you to take snapshots of the condition of your data at different points in time, and then to compare those profiles of your data. Identifying trends with Data Drift Analysis™ allows you to correct flawed software or procedures that may have a negative impact on your data, all before the impact becomes a serious data quality issue. So, the bottom line when it comes to data is this: To avoid data-related risks and surprises, establish an ongoing data profiling and conditioning process so that you are proactively aware of the state of your data. At Qbase, we have the expertise and tools, such as Data Drift Analysis™, to lend a hand.
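To make the snapshot-and-compare idea concrete, here is a minimal Python sketch. It is not how Data Drift Analysis™ works internally (the post does not say); the `profile` and `drift` functions, the blank-rate metric, and the 5% threshold are all assumptions chosen for illustration.

```python
def profile(rows, fields):
    """Summarize each field: share of blank values and number of distinct values."""
    total = len(rows)
    summary = {}
    for field in fields:
        values = [str(row.get(field) or "").strip() for row in rows]
        blanks = sum(1 for value in values if not value)
        summary[field] = {
            "blank_rate": blanks / total if total else 0.0,
            "distinct": len({value for value in values if value}),
        }
    return summary


def drift(old, new, threshold=0.05):
    """Return fields whose blank rate moved by more than `threshold` between two profiles."""
    flagged = {}
    for field in old:
        if field not in new:
            continue
        change = new[field]["blank_rate"] - old[field]["blank_rate"]
        if abs(change) > threshold:
            flagged[field] = round(change, 3)
    return flagged


# Hypothetical snapshots of the same table taken at two points in time.
fields = ["name", "zip"]
january = [{"name": "Ann Lee", "zip": "45402"}, {"name": "Bo Diaz", "zip": "45403"}]
march = [{"name": "Ann Lee", "zip": ""}, {"name": "Bo Diaz", "zip": "45403"}]

drifted = drift(profile(january, fields), profile(march, fields))
print(drifted)
```

In this toy data the ZIP field goes from fully populated to half blank, so it is the only field flagged. A real profile would likely track more than blank rates (value formats, ranges, distinct counts), but the compare step would look much the same.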

Mar 9, 2010

A bit about our standard data improvement processes

Posted by QbaseTechTeam at 3:09 PM Labels: data cleansing, data discovery 0 comments

One of the primary areas of focus here at Qbase is data, specifically on what is known as "data hygiene" — literally turning dirty data into clean data. When we receive data from clients requesting data hygiene either as a standalone process or as just one step in a larger project (such as a data migration or integration project), the first thing we always do is examine the data (typically this involves using our organic software, such as Qbase Data Discovery™). We then process the data files, which almost always results in a significant improvement to the data. Some of the standard processes we perform to enrich and improve our clients' data include:

De-duping records to remove or correct duplicate records
Distinguishing an individual record (a person) from a business
Parsing person names into standard fields; in other words, identifying and breaking out the title (Adm.), first name, last name, suffix (Jr.), and so on
Identifying invalid or undeliverable addresses and correcting or updating these addresses where possible
Parsing address components into standard fields; again, breaking out the house number, the pre-directional if used (East), street name, the post-directional if used (NW), and all of that
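Two of the steps listed above, name parsing and de-duping, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `TITLES` and `SUFFIXES` lists are tiny hypothetical samples, middle names are ignored, and the de-dupe key is an exact match on normalized fields, which is far cruder than a production hygiene process.

```python
TITLES = {"mr", "mrs", "ms", "dr", "adm"}      # hypothetical sample list
SUFFIXES = {"jr", "sr", "ii", "iii"}           # hypothetical sample list


def parse_name(raw):
    """Split a personal name into title, first, last, and suffix fields."""
    tokens = raw.replace(",", " ").split()
    parsed = {"title": "", "first": "", "last": "", "suffix": ""}
    if tokens and tokens[0].rstrip(".").lower() in TITLES:
        parsed["title"] = tokens.pop(0)
    if tokens and tokens[-1].rstrip(".").lower() in SUFFIXES:
        parsed["suffix"] = tokens.pop()
    if tokens:
        parsed["first"] = tokens[0]
    if len(tokens) > 1:
        parsed["last"] = tokens[-1]   # anything between first and last is dropped here
    return parsed


def dedupe(names):
    """Keep the first record seen for each normalized (first, last, suffix) key."""
    seen = set()
    unique = []
    for raw in names:
        parsed = parse_name(raw)
        key = tuple(parsed[k].rstrip(".").lower() for k in ("first", "last", "suffix"))
        if key not in seen:
            seen.add(key)
            unique.append(parsed)
    return unique


records = dedupe(["Adm. John Smith Jr.", "john smith, jr", "Jane Doe"])
print(records)
```

Here the first two inputs collapse into one record because they normalize to the same key, even though one carries a title and the other does not.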

If you have dirty data, Qbase has an unmatched data hygiene system to clean your data, which is the first, best step for any data-related project.

Feb 26, 2010

The trouble with laptops

Posted by QbaseTechTeam at 9:44 AM Labels: data protection 0 comments

Walk into your local coffee shop or favorite lunch place and you’ll see laptops and "netbooks" at practically every table. Welcome to the wireless world! A world, unfortunately, that is fraught with danger, where corporate and personal information can be stolen right out from under your keyboard. The reason? Most wireless "hotspots" use weak or no encryption/security. While this makes it easy for you to connect, it also makes it easy for hackers to spy on you. And once you’re online at your favorite WiFi hotspot, hackers can quietly probe your computer, stealing passwords, data, even your identity. But things are not as dire as this scenario makes them seem. Hackers can be stopped if you use a firewall. But you need to be selective. Some firewalls are almost (but not quite) more trouble than they're worth, with constant popups and alerts and the dramatic performance slowdown they inflict on your system. The best plan, then, may be to turn to an expert to help you select and properly configure a solution. At Qbase, for example, our engineers can configure personal firewall solutions for your mobile workforce that strike a delicate balance between restrictive and functional. In fact, in most cases you won’t even know the firewall is working. But all of those frustrated hackers will...

Feb 18, 2010

The secret to effective data disaster recovery

Posted by QbaseTechTeam at 2:26 PM Labels: data protection 0 comments

As a data company, Qbase is not focused exclusively on collecting, merging, cleansing, and migrating data. We are also keen on preserving data. A key component of data preservation is having a workable recovery plan. However, once you have a plan in place, don't do what a lot of businesses do: relax and let the backup solution run “on its own.” (There they sit, falsely confident that all backups are complete and readable, thinking to themselves, "What could possibly go wrong?") What you should do instead, because you're not complacent, because you're smarter than that, is to be just a bit more proactive with your disaster recovery system. How? By creating an easy-to-follow procedure for quickly verifying and auditing your data backups. What should this procedure include? At the very least, regular testing to verify that your data restores correctly to various temporary locations. It should also include testing to verify the data consistency of the restored files. A truly good plan will also include a strategy for auditing daily backup logs, as well as a contingency plan for when (not if!) one of your backups does not complete successfully. Be proactive and you'll find yourself in a much better position should disaster strike down the road. Only then can you truly and confidently relax.
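As one way to picture the "restore to a temporary location and verify consistency" step, here is a small Python sketch that compares SHA-256 checksums of source files against their restored copies. The function names and the toy files are assumptions for illustration; in practice you would more likely compare against checksums recorded at backup time, since live source files keep changing.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return (relative_path, problem) pairs; an empty list means the restore matched."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append((str(rel), "missing"))
        elif sha256_of(src) != sha256_of(restored):
            problems.append((str(rel), "checksum mismatch"))
    return problems


# Toy demonstration: one good file, one corrupted file, one file never restored.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "original"
    restored = Path(tmp) / "restored"
    original.mkdir()
    restored.mkdir()
    for name, text in [("a.txt", "alpha"), ("b.txt", "bravo"), ("c.txt", "charlie")]:
        (original / name).write_text(text)
    (restored / "a.txt").write_text("alpha")
    (restored / "b.txt").write_text("brav0")
    demo_problems = verify_restore(original, restored)

print(demo_problems)
```

A check like this could run on a schedule after each test restore, with any non-empty result feeding the contingency plan described above.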

Feb 10, 2010

Our software & database development expertise

Posted by QbaseTechTeam at 10:38 AM Labels: applications, healthcare, QIP 0 comments

The user interface for the Qbase Information Platform™. Click to view larger, clearer image.

The Qbase Information Platform™, or QIP™, is a hosted data repository developed over two years in conjunction with our partners at GDAHA, the Greater Dayton Area Hospital Association. QIP™ allows Ohio’s largest hospital association and its 20+ healthcare organizations to aggregate, condition, and view critical emergency room operations data up to 95% faster than before this technology was developed and deployed. This puts the power of the data in the hands of the people who need it, so they can make timely operational decisions that affect how the Dayton region delivers emergency healthcare services to its population. Development was an iterative process to ensure that Qbase designed exactly what the experts at GDAHA required. The system’s main feature is its easy-to-navigate dashboard, which lets users drill down on and filter the data to expose trends otherwise “buried” in the statistical databases. The graphical dashboard presents data from multiple databases within a secure online system as a managed repository. The data can then be analyzed from different perspectives using intuitive drop-down menus, buttons, and check boxes that enable the user to include or exclude specific data sources and views. Rather than being restricted to rows and columns of raw numbers, as in most data views, data within QIP™ now displays as color-coded, dynamic visuals. These visuals, which include bar charts, pie charts, and other familiar graphic types, are fully customizable in size, shape, format, color and, of course, data content. Most importantly, the full capabilities of the QIP™ system can be transferred to other hospital operational functions quite easily; it’s all about the data.

Feb 1, 2010

Our data transformation expertise

Posted by QbaseTechTeam at 11:32 AM Labels: data transformation 0 comments

Today's post is all about data transformation, and how Qbase does it differently. Sure, there are a whole lot of commercial transformation engines out there, but what makes Qbase better? Well, it really comes down to the technological advancements we have developed for our transformation engine: Qbase Data Transformer™. These advancements are in the following areas: in our "non-blocking" data flow architecture; in our use of late binding based on the requirements of the data; in our leveraging of deliberate error processing (which allows processing to continue when certain known errors are encountered); and, finally, in our use of distributed HTC, where jobs run across multiple servers. Interested in the details on these differentiators? Read on...

Data flow architecture — Qbase Data Transformer™ has been designed to run "non-blocking" (at least as much as the data flows allow). This allows a job to write finished results to output files even while the original data sets are still being read.
Late binding — Rather than making rigid requirements regarding which fields are required and in what order, we've designed QDT™ for maximum flexibility with respect to data source changes. We did this by basing data field requirements on a runtime binding against the requirements of the data flow. So, as long as the fields required by the transformation are present in some order, the job will run successfully. Any additional data not required by the flow is passed on unchanged.
Deliberate error processing — By "deliberate error processing," we mean that each component in a QDT™ job is "aware" of explicit error types that allow bad data to continue to flow through the system, retaining its original form and error status. Many systems kick out records and fields as errors occur, but QDT™ can let these continue to flow through the system allowing their corresponding records to participate in subsequent processing (where it still makes sense, that is).
Distributed High Throughput architecture — QDT™ jobs are able to run with individual components distributed across multiple servers in an HTC setup. This provides greater parallel processing and allows the job to move processing to systems which may have specific resources unavailable on other servers. When combined with the data flow architecture, this results in quite a boon to overall throughput.
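The first three ideas in the list above (non-blocking flow, late binding, and deliberate error processing) can be illustrated with a short Python sketch. This is not QDT™ code; it is a hypothetical pipeline built from generators, with made-up function and field names, and it does not attempt the distributed HTC part at all.

```python
import csv
import io


def read_records(handle):
    """Yield one record at a time, so downstream steps start before reading finishes."""
    yield from csv.DictReader(handle)


def parse_amount(records, field="amount"):
    """Convert `field` to a number, binding to it by name at run time.

    Column order does not matter and extra columns pass through unchanged.
    A bad value is not dropped: the record keeps its original value and is
    tagged with an error status so later steps can still use it.
    """
    for record in records:
        out = dict(record)
        try:
            out[field] = float(record[field])
            out["error"] = ""
        except (ValueError, TypeError):
            out["error"] = f"bad {field}"
        yield out


# The required field sits in the middle of the row; "note" and "id" are extras.
source = io.StringIO("note,amount,id\nfirst,10.5,1\nsecond,n/a,2\n")
results = list(parse_amount(read_records(source)))
for row in results:
    print(row)
```

Because each stage is a generator, the first output record is available as soon as the first input row has been read, which is the streaming behavior the "non-blocking" item describes in miniature.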

Find out how we can put these superior technologies to work for you today.

Jan 28, 2010

Our large scale production capabilities

Posted by QbaseTechTeam at 4:49 PM Labels: queries, scale, search response 0 comments

We're often asked, as a small business, if we have the ability to handle massive amounts of records. Specifically, what is our background and track record in this area? Well, in terms of just scale, Qbase personnel have tackled some of the largest search challenges out there, including a system comprised of over 10,000 distinct sources of structured data collections with over 8 billion records. In addition to tackling enormous collections of data, Qbase has designed, built, and deployed flexible solutions that provide a balance between high-precision and wide-range recall. These solutions have search response times of less than one second and support over 100 queries per second. Our ability to deliver these solutions is due in large part to our team members' unique skills in data preparation and design, as well as in high throughput computing, data compression, search algorithms and system architecture. See the difference a talented team with years of large-scale production experience can make on your next project.

Jan 27, 2010

Unique problems of diverse data sets

Posted by QbaseTechTeam at 2:28 PM Labels: queries, search response 0 comments

If you are in the search or data business, or collect this data for yourself, you know firsthand about the search and data challenges found in diverse data sets from military, healthcare, law enforcement, research, insurance, and financial domains, to name just a few. But data from each of these different domains present a unique set of challenges because, although they may share some characteristics, they are actually quite different when viewed in depth. Qbase excels at processing legal data, for example, because our personnel have a deep understanding of the citation and precedence system interwoven into case law documents. We successfully process news data, on the other hand, because our technicians have set up systems to handle the constant stream of new data and built the necessary automated topic classification and entity extraction systems. With public records data, Qbase technicians know how to process the billions of highly-interrelated information fragments because we have a true understanding of the data, its characteristics, and the advanced algorithms. And our team fully understands that public records (such as healthcare data) bring with them issues involved in privacy protection and legislative compliance, and we are prepared to deal with those issues. Qbase brings years of in-depth experience across multiple data domains when solving problems in diverse data sets. Let us help you solve your unique data problems.

Jan 24, 2010

How Qbase analyzes data

Posted by QbaseTechTeam at 10:27 AM Labels: data discovery, Qbase tools 0 comments

The user interface of our tool, Qbase Data Discovery™ (QDD™). In this case, the Field Analysis tab is shown. Click to view a larger, clearer image.

When we start data projects for clients, we use only our Qbase organic tools. That way, we don't need to rely on licensing or installing third-party software, or bringing anyone up to speed on external technologies — we just get to work and get results. We start by using our data discovery tool, Qbase Data Discovery™ (QDD™). QDD™ identifies the type of data in a field automatically, based on dynamically-created definitions for a particular data domain. QDD™ not only profiles the data, but QDD™ also exports the data and associated metadata for use in downstream processes. As part of the valid data definition process, we optionally present results of the discovery to the client to help identify data issues in real-time. Our tools and methods give Qbase a true advantage over our competitors who use slower and more expensive technologies. In other words, Qbase’s tools present real results faster, so you can start making inroads on data migration projects sooner, with less overhead, than any other system or supplier out there. Learn more about Qbase technologies at http://www.qbase.us/our_technology.html
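For readers curious what automatic field typing can look like, here is a deliberately simple Python sketch. It is not QDD™: the post describes dynamically-created definitions per data domain, whereas this example uses a fixed, hypothetical list of regular expressions and a 90% match threshold, both chosen only for illustration.

```python
import re

# Hypothetical, fixed definitions; checked in order, first sufficient match wins.
PATTERNS = [
    ("zip_code", re.compile(r"^\d{5}(-\d{4})?$")),
    ("date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),
    ("integer", re.compile(r"^-?\d+$")),
    ("decimal", re.compile(r"^-?\d+\.\d+$")),
]


def classify_field(values, min_share=0.9):
    """Name the first pattern matched by at least `min_share` of non-blank values."""
    filled = [value.strip() for value in values if value and value.strip()]
    if not filled:
        return "empty"
    for name, pattern in PATTERNS:
        hits = sum(1 for value in filled if pattern.match(value))
        if hits / len(filled) >= min_share:
            return name
    return "text"


print(classify_field(["45402", "45403-1234", ""]))
print(classify_field(["2010-01-24", "2010-02-01"]))
print(classify_field(["12", "7", "x"]))
```

Note the built-in ambiguity: a column of five-digit integers would be labeled a ZIP code here, which is one reason domain-specific definitions matter.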

Jan 22, 2010

More about data fusion...

Posted by QbaseTechTeam at 4:13 PM Labels: algorithms, data fusion, RAM 0 comments

In our last post, we discussed how Qbase is able to handle data fusion in real time. The secret is in how we use RAM (Random Access Memory). The indexes we create are highly compressed, but contain all required key fields. The in-memory indexes can then be distributed across a huge number of large memory servers. Scoring algorithms are all absolute, based on a ‘0-to-1’ continuum of values, which allows queries to be efficiently fanned out to hundreds of servers and then collated back into a single ranked answer set. Qbase personnel have developed systems where data from thousands of sources were fused into a collection of approximately eight billion records, all stored in RAM. Response time averaged less than 0.25 seconds for core queries. Data was distributed across 150 large memory machines totaling nearly five terabytes of physical memory. In fact, Qbase projects involving geospatial imagery currently process and store over 200 terabytes of data and support expansion to 12 petabytes. Improve your decision response time and results with Qbase.
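The reason absolute scores matter for fan-out can be shown with a small Python sketch. Everything here is an assumption for illustration: the token-overlap `score` function stands in for whatever scoring Qbase actually uses, the shards are plain lists, and the fan-out is a sequential loop rather than hundreds of servers.

```python
import heapq
from itertools import islice


def score(query, record):
    """Absolute 0-to-1 score: token overlap (Jaccard) between query and record."""
    q, r = set(query.lower().split()), set(record.lower().split())
    return len(q & r) / len(q | r) if q | r else 0.0


def search_shard(shard, query, limit):
    """Score one shard independently and return its best hits, highest score first."""
    hits = [(score(query, record), record) for record in shard]
    hits = [hit for hit in hits if hit[0] > 0]
    return sorted(hits, key=lambda hit: (-hit[0], hit[1]))[:limit]


def search(shards, query, limit=3):
    """Fan the query out to every shard, then merge the sorted partial results."""
    per_shard = [search_shard(shard, query, limit) for shard in shards]
    merged = heapq.merge(*per_shard, key=lambda hit: (-hit[0], hit[1]))
    return list(islice(merged, limit))


shards = [
    ["john smith dayton", "jane doe"],
    ["john smith", "smith jones"],
]
ranked = search(shards, "john smith")
print([record for _, record in ranked])
```

Because a score depends only on the query and the record, never on what else is in the shard, the per-shard lists can be merged directly into one ranking without rescoring.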

Jan 21, 2010

What is Qbase's approach to data fusion?

Posted by QbaseTechTeam at 1:20 PM Labels: data fusion 0 comments

We've had some questions recently on what the Qbase approach is to fusing disparate data sets together. Well, unlike some of our competitors, at Qbase our unique approach to data fusion is based completely on runtime matching and linking, which means there is no need for linking of any data in advance. What our technicians have done instead is index the data based on “loose keys.” The resulting queries bring back relatively large answer sets that are then scored, ranked, and rolled up into linked collections. There are a couple of advantages to doing it this way. First, new data can be added to the system as soon as it’s available and can immediately participate in linked answer sets; there are no complicated linking processes to delay its availability. Second, run-time linking provides the ability to dial in varying link strengths or to try different link algorithms without re-fabricating the entire data collection. Give us a call to discuss your challenges; we’ll convert them to successes.
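A toy version of loose-key indexing with run-time linking might look like the following Python sketch. The key definition (first letter of the last name plus a three-digit ZIP prefix), the `LooseIndex` class, and the use of `difflib.SequenceMatcher` as the link algorithm are all assumptions for illustration, not a description of the Qbase implementation.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def loose_key(record):
    """A deliberately coarse key: first letter of the last name plus ZIP prefix."""
    return (record["last"][:1].lower(), record["zip"][:3])


class LooseIndex:
    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, record):
        """No linking happens here, so a new record is searchable immediately."""
        self.buckets[loose_key(record)].append(record)

    def linked(self, probe, min_strength=0.8):
        """Score the loose-key candidates at query time and keep the strong links."""
        probe_name = f'{probe["first"]} {probe["last"]}'.lower()
        results = []
        for candidate in self.buckets.get(loose_key(probe), []):
            name = f'{candidate["first"]} {candidate["last"]}'.lower()
            strength = SequenceMatcher(None, probe_name, name).ratio()
            if strength >= min_strength:
                results.append((strength, candidate))
        return sorted(results, key=lambda item: -item[0])


index = LooseIndex()
index.add({"first": "John", "last": "Smith", "zip": "45402"})
index.add({"first": "Jane", "last": "Smythe", "zip": "45403"})
index.add({"first": "Mary", "last": "Jones", "zip": "45402"})

probe = {"first": "Jon", "last": "Smith", "zip": "45402"}
strict = [c["first"] for _, c in index.linked(probe, min_strength=0.8)]
relaxed = [c["first"] for _, c in index.linked(probe, min_strength=0.6)]
print(strict, relaxed)
```

Changing `min_strength`, or swapping out the similarity function, changes which records link together without touching the stored index, which is the "dial-in" advantage the post describes.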