Good
- The ability to provision huge databases as needed, without going through a costly and slow procurement process to obtain the hardware and software
- The ability to scale to handle huge databases, perhaps well beyond the petabyte range
- The potential to use an elastic set of resources to return result sets with enough speed to be actually relevant when operating a business
- The potential to save huge amounts of money over the years versus the cost of using your own hardware and software
- Built and optimized for doing aggregation queries over large sets of data. When we want to answer a question with Redshift, we just write a SQL query and get an answer within a few minutes—if not seconds.
- From a user perspective, any of our developers can write a SQL query and have an answer to their question in less than 5 minutes. Moving from even a Hadoop-based workflow to an interactive console session with Redshift is a major improvement. I assume moving from an RDBMS to Redshift would be a big improvement too.
- Additionally, since much of the user-facing part of Redshift is based on PostgreSQL, there is a large ecosystem of mature, well-documented tools and libraries for us to take advantage of.
- Amazon provides an impressive web management console with Redshift. For a 1.0 product, the console is comprehensive and offers much more information than we expected it to.
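To make the "just write a SQL query" point concrete, here is a minimal sketch of the kind of aggregation query meant above. It is only an illustration: the table `page_views` and its columns are hypothetical, and Python's built-in `sqlite3` is used as a local stand-in so the example runs without a cluster. Against Redshift, the same `GROUP BY` statement would be sent through a PostgreSQL-compatible client instead.

```python
import sqlite3

# Local stand-in only: sqlite3 lets the sketch run without a Redshift cluster.
# The table name and columns are hypothetical, made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, clicks INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("US", 10), ("US", 5), ("DE", 7), ("FR", 2), ("DE", 1)],
)

# The question itself is plain SQL: count rows and sum clicks per country.
AGG_QUERY = """
    SELECT country, COUNT(*) AS views, SUM(clicks) AS total_clicks
    FROM page_views
    GROUP BY country
    ORDER BY total_clicks DESC, country
"""

rows = conn.execute(AGG_QUERY).fetchall()
for country, views, total_clicks in rows:
    print(country, views, total_clicks)
```

The point of the sketch is the shape of the workflow: one declarative query, one result set, no job to write or schedule.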
Bad
- The possibility of outages; it’s not that your internal data warehouse does not go down at times, but any failures will be public and give cloud computing a black eye internally
- The costs of data migration and integration; in many instances, you’ll need huge amounts of bandwidth to transmit the data from internal systems to the cloud-hosted Redshift, or you’ll be shipping USB drives via FedEx to Amazon Web Services
- A lack of best practices; we just started with public cloud-hosted data warehouses and clearly have some things to learn
- The possibility of higher costs; although many organizations will find cost savings with cloud-hosted databases such as Redshift, many will discover that their cloud computing bill is much higher than anticipated, perhaps exceeding the cost of an on-premises database
- Security concerns with the public cloud, including the risk of data leakage
- Depending on which instance it is accessed from, traffic will go over the public internet rather than MPLS, so it can be slow
- The write performance of Amazon Redshift is relatively low compared to "classical" relational databases (in your data center), as you have to upload all data into the cloud
- High Availability
- Full table scan
- Data Loading
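The migration and upload points in the list above come down to simple arithmetic, so a rough estimate is easy to sketch. The function below is a back-of-the-envelope helper, not a measurement: the name `transfer_hours`, the decimal terabyte (10^12 bytes), and the default 80% link efficiency are all assumptions chosen for illustration.

```python
def transfer_hours(data_terabytes: float, link_mbps: float,
                   efficiency: float = 0.8) -> float:
    """Estimate the hours needed to upload data over a network link.

    Assumptions (illustrative only): 1 TB = 10**12 bytes, and only a
    fraction `efficiency` of the nominal link speed is usable in practice.
    """
    if data_terabytes < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    total_bits = data_terabytes * 1e12 * 8            # bytes -> bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / usable_bits_per_second / 3600  # seconds -> hours


# Example: 10 TB over a 100 Mbit/s line at the assumed 80% efficiency.
hours = transfer_hours(10, 100)
print(f"{hours:.0f} hours (~{hours / 24:.1f} days)")
```

Under these assumptions, 10 TB over a 100 Mbit/s line takes roughly 278 hours, or about 11.6 days, which is why shipping drives can look attractive for the initial load.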
Very Good Reference:
http://word.bitly.com/post/48854093418/speeding-things-up-with-redshift
Summary from above:
Redshift can speed up and expand our ad hoc data analysis.