Unstructured data is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. It comes in many unrelated forms; documents, text files, spreadsheets, presentation files, video files, audio files, and pictures are some of the more common formats. Computerworld magazine states that unstructured information might account for 70% to 80% or more of all data in organizations. Typically, this data is stored in individual files, with no application organizing and managing the relationships between those files.
Why Is This Important?
Simple: most data growth (over 80%) comes from unstructured data. In this digital age, where cell phones and remote devices of all kinds capture and store information for later analysis, the volume of digital data is growing exponentially. Managing this kind of data is therefore critical.
The Main Challenges with Managing Unstructured Data
The three main challenges are:
- Growth
- Retrieval/Usage
- Cost Efficiency
Growth – Having a place to store this data, and storing it efficiently, is critical. This means using data reduction technologies such as compression and data deduplication to ensure the information occupies the minimum amount of space. It also means tracking storage utilization and growth trends for this type of information, so that you can forecast demand and plan storage expansion based on those trends.
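As a rough illustration of planning expansion from growth trends, the sketch below fits a straight line to monthly usage samples and estimates how many months remain before a given capacity is reached. The function names, the monthly sampling interval, and the use of terabytes are assumptions made for this example, and a linear fit is a deliberate simplification: if usage really is growing exponentially, a straight line will overestimate the time remaining.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(usage_tb: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (month_index, usage_tb).

    Returns (slope, intercept), where slope is growth in TB per month.
    Samples are assumed to be evenly spaced, one per month.
    """
    n = len(usage_tb)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(usage_tb) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, usage_tb))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def months_until_full(usage_tb: Sequence[float], capacity_tb: float) -> Optional[float]:
    """Months from the latest sample until the fitted line reaches capacity.

    Returns None when usage is flat or shrinking, and 0.0 when the
    fitted line has already reached capacity.
    """
    slope, intercept = fit_linear_trend(usage_tb)
    if slope <= 0:
        return None
    latest_index = len(usage_tb) - 1
    full_index = (capacity_tb - intercept) / slope
    return max(0.0, full_index - latest_index)


if __name__ == "__main__":
    # Hypothetical monthly usage samples in TB, oldest first.
    samples = [10.0, 12.0, 14.0, 16.0]
    print(months_until_full(samples, capacity_tb=20.0))
```

In practice the samples would come from whatever utilization reports the storage system already produces, and the forecast would be rerun as new samples arrive.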
Retrieval/Usage – It is important to track not only storage usage trends, but also which data (files) are being used or accessed on a regular basis. Many organizations keep all their data on high-performance storage even though more than 90% of this information is never accessed, costing them millions of dollars in unnecessary IT spend. This spend could be avoided if you know which files are needed for rapid retrieval and which can be stored in very low-cost archival storage.
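One simple way to find out which files are candidates for archival storage is to look at each file's last-access timestamp. The sketch below walks a directory tree and splits files into "hot" and "cold" groups. The function name `split_hot_cold` and the 180-day threshold are assumptions chosen for illustration, not recommendations. Note also that access times are only as reliable as the filesystem makes them: volumes mounted with options such as `noatime` or `relatime` may not update them on every read.

```python
import os
import time
from pathlib import Path
from typing import Dict, List, Optional


def split_hot_cold(
    root: str,
    cold_after_days: int = 180,  # illustrative threshold, not a recommendation
    now: Optional[float] = None,
) -> Dict[str, List[Path]]:
    """Group files under root by whether they were accessed recently.

    A file is "cold" if its last-access time is older than
    cold_after_days before now; otherwise it is "hot".
    """
    now = time.time() if now is None else now
    cutoff = now - cold_after_days * 86400  # seconds per day
    groups: Dict[str, List[Path]] = {"hot": [], "cold": []}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                atime = path.stat().st_atime
            except OSError:
                # Skip files that vanish or cannot be read mid-scan.
                continue
            groups["cold" if atime < cutoff else "hot"].append(path)
    return groups


if __name__ == "__main__":
    result = split_hot_cold(".")
    print(len(result["hot"]), "hot files,", len(result["cold"]), "cold files")
```

The "cold" list is only a starting point for review; business or retention requirements may still call for keeping some rarely accessed files on faster storage.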
Cost Efficiency – This means storing the information in the most cost-effective manner that still meets business application requirements while staying within the IT budget.