The Problem
- Typically we have average-case or worst-case response time requirements.
- Example: worst-case response time < 1 second for all requests.
- Application performance drops as the number of users grows.
- Under light load there is excess capacity; under heavy load one component becomes the bottleneck.
Solution: Improve Hardware
First steps: understand the issue, find the bottleneck
- Benchmark/load test your system.
- Performance-monitor each component: CPU, memory, disk, network.
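A load test can be sketched in a few lines: fire many concurrent requests and report the average and worst-case response time (the metrics the requirements above are stated in). This is a minimal illustration, not a real tool; `handle_request` is a hypothetical stand-in for an HTTP call to the system under test.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(i):
    """Stand-in for one request to the system under test
    (a real load test would make an HTTP call here)."""
    time.sleep(0.01)  # simulated 10 ms service time

def timed_request(i):
    start = time.perf_counter()
    handle_request(i)
    return time.perf_counter() - start

def load_test(n_requests=100, concurrency=10):
    """Run n_requests with the given concurrency; return (avg, worst) latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_request, range(n_requests)))
    return sum(latencies) / len(latencies), max(latencies)

avg, worst = load_test()
print(f"average {avg*1000:.1f} ms, worst {worst*1000:.1f} ms")
```

Real tools (ab, JMeter, wrk) add ramp-up, percentiles, and distributed load generation, but the measurement idea is the same.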
Now spend $
- Get more RAM (access times in the nanoseconds).
- Better disks (5400 RPM vs 10000 RPM, lower seek times, more cache).
- RAID (Redundant Array of Independent/Inexpensive Disks): for redundancy, durability, and performance.
- Switch to SSD
- HDD vs SSD:
         | HDD       | SSD        |
seek     | 4-12 ms   | < 0.1 ms   |
latency  | 2-5 ms    | 0          |
transfer | 125 MB/s  | > 540 MB/s |
$        | $.08/GB   | $.60/GB    |
- Latency is the time between request and response. For a disk, rotational latency is the time
between the head reaching the right track and the data rotating under the read/write head.
- Caching: copy the most frequently and most recently used data into higher-speed devices;
search for data from the highest-speed device down to the slowest.
- Combine RAM, disk, and SSD via caching, either explicitly or with hybrid drives.
- More network bandwidth
- Faster CPUs, or more CPUs (expensive compared to adding more systems).
- Problem: newer, faster CPUs trade clock speed for more cores;
systems must be parallelizable to take full advantage of them.
- Example: 12 cores/cpu at the high end
- Example: GPU thousands of simple cores
This last example typically does not apply to web applications; it applies more to numerical computing, rendering, decryption, ...
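The caching idea above (search from the fastest device down, keep hot data high) can be sketched with plain dicts standing in for RAM, SSD, and HDD. This is a toy illustration of the lookup/promotion pattern, not how any real hybrid drive is implemented, and it omits eviction entirely.

```python
# Toy cache hierarchy: look data up from the fastest tier down, and
# copy ("promote") whatever we find into all faster tiers.
ram, ssd, hdd = {}, {}, {"page.html": "<html>...</html>"}
tiers = [ram, ssd, hdd]  # fastest first

def read(key):
    for i, tier in enumerate(tiers):
        if key in tier:
            value = tier[key]
            for faster in tiers[:i]:  # promote into faster tiers
                faster[key] = value
            return value
    raise KeyError(key)

read("page.html")          # first read: served from the "HDD"
assert "page.html" in ram  # now cached in "RAM" for the next read
```

A real cache also needs an eviction policy (e.g. least-recently-used) since the fast tiers are the smallest.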
Solution: Application optimizations
- Make sure browsers are caching images and other static content.
- Serve static content instead of dynamic content where possible.
- Pre-compute mostly-static web pages and serve them instead of generating them with PHP etc.
- Minimize image and other file sizes transferred to clients.
- Optimize algorithms in scripts (e.g., an n log n sort vs an n^2 sort; choose better data structures).
- Consider time/space tradeoffs: pre-compute and store useful information to speed up response time.
- Minimize requests to the database and filesystem.
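The time/space tradeoff above can be shown with memoization: spend memory to cache a result instead of recomputing it on every request. `expensive_report` is a hypothetical stand-in for slow work such as a big database query.

```python
import time
from functools import lru_cache

def expensive_report(day):
    """Hypothetical slow computation (e.g. a heavy DB aggregation)."""
    time.sleep(0.05)  # simulate 50 ms of work
    return f"report for day {day}"

@lru_cache(maxsize=1024)  # trade memory for response time
def cached_report(day):
    return expensive_report(day)

t0 = time.perf_counter(); cached_report(1); first = time.perf_counter() - t0
t0 = time.perf_counter(); cached_report(1); second = time.perf_counter() - t0
assert second < first  # the cache hit skips the recomputation entirely
```

The same idea at larger scale is exactly the pre-computed static pages bullet: generate once, serve many times.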
Solution: Apache optimizations
Solution: Web Server alternatives
- nginx and its event-driven architecture
- node.js: a replacement for the standard web (HTTP) server, a JavaScript application server, good for WebSockets.
- Or other alternatives to Apache.
Database optimizations
- A SQL relational DB provides ACID: Atomicity, Consistency, Isolation, Durability.
- Schema-level optimizations: create indexes, optimize queries, denormalize data.
- Move Database off web server
- DB level optimizations: replication (mirroring, typically master/slave)
- Memcached (replaces the DB with RAM; you lose durability and the query language, and get only low-level consistency).
- NoSQL DB options: not ACID compliant; strict consistency is typically dropped in favor of eventual consistency.
- mongodb
- Document datastore (JSON-like documents). Each document is stored and updated atomically, unlike SQL where a record's data is spread among tables.
- Sharding by range or by hashing; each shard is replicated.
- Replication: one primary and two secondaries (typical).
The primary is for writing and reading; secondaries are for reading only.
- Apache Cassandra (a top-level Apache project)
- others...
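Sharding by hashing, mentioned above for mongodb, can be sketched as follows: a key's shard is chosen by hashing the key, which spreads data and load across shards. The shard count and in-memory dicts are illustrative assumptions, not any particular database's implementation.

```python
import hashlib

N_SHARDS = 4
shards = [{} for _ in range(N_SHARDS)]  # each dict stands in for one shard server

def shard_for(key):
    """Hash the key and map it to a shard index."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % N_SHARDS

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)][key]

for user in ("alice", "bob", "carol", "dave"):
    put(user, {"name": user})

assert get("alice") == {"name": "alice"}  # routed to the same shard it was written to
```

Real systems layer replication on top (each shard stored on a primary plus secondaries, as above) and use consistent hashing so that adding a shard does not remap every key.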
Goal: Scalability
- The application is designed to allow substantial growth. Typically it is organized
so that, to handle more work, you add more workers (whatever "work" and "workers" mean).
- Linear scalability: n x workers = n x performance.
- This only works if we can parallelize the work.
- Examples:
- Each drive = 1 TByte, add n drives for n times the space.
- Each server handles 10,000 requests/second. Two servers handle 20,000 requests/second.
Three handle 30,000 requests/second. n servers handle n*10,000 requests/second.
To do this, need a scalable architecture.
- Sequential vs parallelizable tasks:
- 1 baker takes 1 hour to bake 1 cake.
- 10 bakers take 1 hour to bake 10 cakes.
- 10 bakers take 1/10 hour to bake 1 cake?
- In CS theory this is the question NC = P? (can every efficiently solvable problem be efficiently parallelized?)
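The baker example can be made quantitative. If a fraction of the job is inherently sequential (the cake must sit in the oven no matter how many bakers you hire), adding workers only speeds up the rest; this is Amdahl's law (not named in the notes above, but it formalizes exactly this point).

```python
def speedup(n_workers, sequential_fraction):
    """Amdahl's law: overall speedup with n workers when a fixed
    fraction of the job cannot be parallelized."""
    return 1 / (sequential_fraction + (1 - sequential_fraction) / n_workers)

assert speedup(10, 0.0) == 10.0  # fully parallel work scales linearly
print(speedup(10, 0.5))          # half sequential: only ~1.8x with 10 workers
```

This is why 10 bakers bake 10 cakes in an hour (independent jobs, fully parallel) but cannot bake 1 cake in 1/10 hour.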
- Platforms for scalable computing