The Problem
- Typically we have average-case or worst-case response time requirements.
- Example: worst-case response time < 1 second for all requests.
- Application performance drops as the number of users grows.
- Under light load there is excess capacity; under heavy load one component becomes the bottleneck.
Solution: Improve Hardware
First steps: understand the issue, find the bottleneck
- Benchmark/load test your system.
- Performance-monitor each component: CPU, memory, disk, network.
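A load test can be sketched in a few lines: fire many concurrent requests and report the average and worst-case response time (the metrics the requirements above are stated in). This is a minimal illustration, not a real tool; `handle_request` is a hypothetical stand-in for an HTTP call to the system under test.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(i):
    """Stand-in for one request to the system under test
    (a real load test would make an HTTP call here)."""
    time.sleep(0.01)  # simulated 10 ms service time

def timed_request(i):
    start = time.perf_counter()
    handle_request(i)
    return time.perf_counter() - start

def load_test(n_requests=100, concurrency=10):
    """Run n_requests with the given concurrency; return (avg, worst) latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_request, range(n_requests)))
    return sum(latencies) / len(latencies), max(latencies)

avg, worst = load_test()
print(f"average {avg*1000:.1f} ms, worst {worst*1000:.1f} ms")
```

Real tools (ab, JMeter, wrk) add ramp-up, percentiles, and distributed load generation, but the measurement idea is the same.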
Now spend $
- Get more RAM (access times in the nanoseconds).
- Better disks (5400 RPM vs 10000 RPM, lower seek times, more cache).
- RAID (Redundant Array of Independent/Inexpensive Disks): for redundancy, durability, and performance.
- Switch to SSD
- HDD vs SSD:
         | HDD       | SSD        |
seek     | 4-12 ms   | < 0.1 ms   |
latency  | 2-5 ms    | 0          |
transfer | 125 MB/s  | > 540 MB/s |
$        | $.08/GB   | $.60/GB    |
- Latency is the time between request and response. For a disk, rotational latency is the time
between the head reaching the right track and the data rotating under the read/write head.
- Caching: copy the most frequently and most recently used data into higher-speed devices;
search for data from the highest-speed device down to the slowest.
- Combine RAM, disk, and SSD via caching, either explicitly or with hybrid drives.
- More network bandwidth
- Faster CPUs, or more CPUs (expensive compared to adding more systems).
- Problem: newer, faster CPUs trade clock speed for more cores;
systems must be parallelizable to take full advantage of them.
- Example: 12 cores/cpu at the high end
- Example: GPU thousands of simple cores
This last example typically does not apply to web applications; it applies more to numerical computing, rendering, decryption, ...
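The caching idea above (search from the fastest device down, keep hot data high) can be sketched with plain dicts standing in for RAM, SSD, and HDD. This is a toy illustration of the lookup/promotion pattern, not how any real hybrid drive is implemented, and it omits eviction entirely.

```python
# Toy cache hierarchy: look data up from the fastest tier down, and
# copy ("promote") whatever we find into all faster tiers.
ram, ssd, hdd = {}, {}, {"page.html": "<html>...</html>"}
tiers = [ram, ssd, hdd]  # fastest first

def read(key):
    for i, tier in enumerate(tiers):
        if key in tier:
            value = tier[key]
            for faster in tiers[:i]:  # promote into faster tiers
                faster[key] = value
            return value
    raise KeyError(key)

read("page.html")          # first read: served from the "HDD"
assert "page.html" in ram  # now cached in "RAM" for the next read
```

A real cache also needs an eviction policy (e.g. least-recently-used) since the fast tiers are the smallest.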
Solution: Application optimizations
- Make sure browsers are caching images and other static content.
- Serve static content instead of dynamic content where possible.
- Pre-compute mostly-static web pages and serve them instead of generating them with PHP etc.
- Minimize image and other file sizes transferred to clients.
- Optimize algorithms in scripts (e.g., an n log n sort vs an n^2 sort; choose better data structures).
- Consider time/space tradeoffs: pre-compute and store useful information to speed up response time.
- Minimize requests to the database and filesystem.
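The time/space tradeoff above can be shown with memoization: spend memory to cache a result instead of recomputing it on every request. `expensive_report` is a hypothetical stand-in for slow work such as a big database query.

```python
import time
from functools import lru_cache

def expensive_report(day):
    """Hypothetical slow computation (e.g. a heavy DB aggregation)."""
    time.sleep(0.05)  # simulate 50 ms of work
    return f"report for day {day}"

@lru_cache(maxsize=1024)  # trade memory for response time
def cached_report(day):
    return expensive_report(day)

t0 = time.perf_counter(); cached_report(1); first = time.perf_counter() - t0
t0 = time.perf_counter(); cached_report(1); second = time.perf_counter() - t0
assert second < first  # the cache hit skips the recomputation entirely
```

The same idea at larger scale is exactly the pre-computed static pages bullet: generate once, serve many times.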
Solution: Apache optimizations
Solution: Web Server alternatives
- nginx and its event-driven architecture
- node.js: a replacement for the standard web (HTTP) server, a JavaScript application server, good for WebSockets.
- Or other alternatives to Apache.
Database optimizations
- A SQL relational DB provides ACID: Atomicity, Consistency, Isolation, Durability.
- Schema-level optimizations: create indexes, optimize queries, denormalize data.
- Move Database off web server
- DB level optimizations: replication (mirroring, typically master/slave)
- Memcached (replaces the DB with RAM; you lose durability and the query language, and get only low-level consistency).
- NoSQL DB options: not ACID compliant; strict consistency is typically dropped in favor of eventual consistency.
- mongodb
- Document datastore (JSON-like documents). Each document is stored and updated atomically, unlike SQL where a record's data is spread among tables.
- Sharding by range or by hashing; each shard is replicated.
- Replication: one primary and two secondaries (typical).
The primary is for writing and reading; secondaries are for reading only.
- Apache Cassandra (a top-level Apache project)
- others...
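Sharding by hashing, mentioned above for mongodb, can be sketched as follows: a key's shard is chosen by hashing the key, which spreads data and load across shards. The shard count and in-memory dicts are illustrative assumptions, not any particular database's implementation.

```python
import hashlib

N_SHARDS = 4
shards = [{} for _ in range(N_SHARDS)]  # each dict stands in for one shard server

def shard_for(key):
    """Hash the key and map it to a shard index."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % N_SHARDS

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)][key]

for user in ("alice", "bob", "carol", "dave"):
    put(user, {"name": user})

assert get("alice") == {"name": "alice"}  # routed to the same shard it was written to
```

Real systems layer replication on top (each shard stored on a primary plus secondaries, as above) and use consistent hashing so that adding a shard does not remap every key.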
Goal: Scalability
- The application is designed to allow substantial growth. Typically it is organized
so that, to handle more work, you add more workers (whatever "work" and "workers" mean).
- Linear scalability: n x workers = n x performance.
- This only works if we can parallelize the work.
- Examples:
- Each drive = 1 TByte, add n drives for n times the space.
- Each server handles 10,000 requests/second. Two servers handle 20,000 requests/second.
Three handle 30,000 requests/second. n servers handle n*10,000 requests/second.
To do this, need a scalable architecture.
- Sequential vs parallelizable tasks:
- 1 baker takes 1 hour to bake 1 cake.
- 10 bakers take 1 hour to bake 10 cakes.
- 10 bakers take 1/10 hour to bake 1 cake?
- In CS theory this is the question NC = P? (can every efficiently solvable problem be efficiently parallelized?)
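The baker example can be made quantitative. If a fraction of the job is inherently sequential (the cake must sit in the oven no matter how many bakers you hire), adding workers only speeds up the rest; this is Amdahl's law (not named in the notes above, but it formalizes exactly this point).

```python
def speedup(n_workers, sequential_fraction):
    """Amdahl's law: overall speedup with n workers when a fixed
    fraction of the job cannot be parallelized."""
    return 1 / (sequential_fraction + (1 - sequential_fraction) / n_workers)

assert speedup(10, 0.0) == 10.0  # fully parallel work scales linearly
print(speedup(10, 0.5))          # half sequential: only ~1.8x with 10 workers
```

This is why 10 bakers bake 10 cakes in an hour (independent jobs, fully parallel) but cannot bake 1 cake in 1/10 hour.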
- Platforms for scalable computing