Large number of concurrent activities
Continuous operation for years
Live updates and maintenance
Concurrency - common theme
Erlang processes are much more lightweight than OS threads
Reliability - what it gives you
Isolation - processes communicate only by message passing
Distribution - works across nodes
Linking/supervision/monitoring - one process takes action when another fails
Small language; few elements; functional - relatively easy to learn
Variables are immutable; no globals
Flow control via pattern matching; recursion
Processes, not mutexes, etc.
Selective receive lets you receive specific messages from anywhere in the message queue (even if other types are ahead of them)
OTP framework - everybody uses this
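Selective receive, as noted above, can be sketched outside Erlang. This is a toy Python model, not Erlang's implementation: the Mailbox class and the (tag, payload) message shape are assumptions made up for illustration, and a real Erlang process would block until a match arrives instead of returning None.

```python
class Mailbox:
    """Toy process mailbox; messages are (tag, payload) tuples (hypothetical shape)."""

    def __init__(self):
        self._queue = []

    def send(self, message):
        # New messages always go to the back of the queue.
        self._queue.append(message)

    def receive(self, matches):
        """Remove and return the oldest message for which matches(message) is true.

        Messages that do not match stay queued in their original order.
        Returns None when nothing matches.
        """
        for index, message in enumerate(self._queue):
            if matches(message):
                return self._queue.pop(index)
        return None


mailbox = Mailbox()
mailbox.send(("log", "starting up"))
mailbox.send(("log", "still starting"))
mailbox.send(("reply", 42))

# Pick out the reply even though two log messages are ahead of it.
reply = mailbox.receive(lambda m: m[0] == "reply")
remaining = list(mailbox._queue)
```

The two log messages are still in the mailbox afterwards, in order, ready for a later receive.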
Riak - a distributed, highly available, eventually consistent, highly scalable, open source key-value database written primarily in Erlang.
Modeled after Amazon Dynamo. See Andy Gross’s 5 years later talk.
Also provides MapReduce, secondary indexes, full-text search
Built for operational ease - it just runs
Multiple clients - .NET, Java, Node.js, etc.
Consistent hashing - no sharding
Stores replicas - N/R/W values - adjustable
(The ring is very similar to the Security Now discussion of Tor.)
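A minimal sketch of the ring idea, assuming a fixed number of equal partitions handed out to nodes round-robin and keys placed by SHA-1 hash. The Ring class, the partition count, and the assignment scheme are illustrative assumptions, not Riak's actual code. The quorums_overlap helper shows the usual reading of N/R/W: a read sees at least one replica from the latest successful write when R + W > N.

```python
import hashlib

RING_SIZE = 2 ** 160  # size of the SHA-1 output space


def ring_position(key: str) -> int:
    """Hash a key to an integer position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class Ring:
    def __init__(self, nodes, partitions=64):
        # Split the ring into equal partitions and assign them round-robin,
        # so neighbouring partitions belong to different nodes.
        self.partitions = partitions
        self.step = RING_SIZE // partitions
        self.owners = [nodes[i % len(nodes)] for i in range(partitions)]

    def preference_list(self, key, n=3):
        """Owners of the n consecutive partitions starting at the key's partition."""
        first = ring_position(key) // self.step
        return [self.owners[(first + i) % self.partitions] for i in range(n)]


def quorums_overlap(n, r, w):
    """True when every read quorum must share a replica with every write quorum."""
    return r + w > n


ring = Ring(["node_a", "node_b", "node_c", "node_d"])
replicas = ring.preference_list("user/1234", n=3)
```

Adding a node only changes who owns some partitions; the key-to-partition mapping itself never moves, which is why no manual sharding is needed.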
Vector clock - how Riak determines which value is stale - tracks the number of operations each actor has performed, plus a timestamp
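A rough sketch of the vector clock comparison, assuming a clock is just a mapping from actor name to operation count. The timestamp mentioned above is left out, and the function names are hypothetical, not Riak's API.

```python
def increment(clock, actor):
    """Return a copy of clock with actor's operation count bumped by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every operation that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify two clocks: equal, one side stale, or concurrent (siblings)."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "b is stale"
    if descends(b, a):
        return "a is stale"
    return "concurrent"


v1 = increment({}, "client_x")
v2 = increment(v1, "client_y")   # builds on v1
v3 = increment(v1, "client_z")   # also builds on v1, unaware of v2
```

Here v1 is stale relative to v2, while v2 and v3 are concurrent: neither descends from the other, so neither can be silently discarded.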
All the nodes in a cluster are peers - no masters or slaves
Nodes exchange their understanding of ring state via gossip protocol
Riak uses the Erlang mesh for this
Can simulate a multi-node installation on a single machine (nice for development)
At about 150 nodes, the cluster doesn’t scale well.
Control vs. Data
Distributed Erlang is good for control plane, not so good for data plane
Sending large data can block
Use TCP, UDP, etc. directly for data plane traffic
Don’t mix control plane and data plane traffic
Riak still does this in a few places, unfortunately (they’re going to fix this)
Fallback vnode holds data for unavailable primary vnode
Hands it off once primary becomes available
Vnode with stale data is repaired via asynchronous update
Active anti-entropy (AAE) - can actively repair stale value before it’s read
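The handoff and repair notes above can be sketched as a toy model. Assumptions: the Vnode class is made up for illustration, versions are plain integers where Riak would use vector clocks, and the repair runs inline here where the notes describe it as asynchronous.

```python
class Vnode:
    """Toy vnode: 'data' holds its own keys, 'hints' holds writes parked for a primary."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.data = {}    # key -> (version, value)
        self.hints = {}   # key -> (version, value)


def put(primary, fallback, key, version, value):
    """Write to the primary if it is up; otherwise park the write on the fallback."""
    if primary.available:
        primary.data[key] = (version, value)
    else:
        fallback.hints[key] = (version, value)


def handoff(primary, fallback):
    """Once the primary is back, move parked writes to it. Returns how many moved."""
    if not primary.available:
        return 0
    moved = 0
    for key, (version, value) in list(fallback.hints.items()):
        current = primary.data.get(key)
        if current is None or current[0] < version:
            primary.data[key] = (version, value)
        moved += 1
    fallback.hints.clear()
    return moved


def read_repair(replicas, key):
    """Return the newest value and overwrite any replica that is missing it or stale."""
    found = [r.data[key] for r in replicas if key in r.data]
    if not found:
        return None
    newest = max(found, key=lambda pair: pair[0])
    for replica in replicas:
        current = replica.data.get(key)
        if current is None or current[0] < newest[0]:
            replica.data[key] = newest
    return newest[1]
```

Active anti-entropy differs from read_repair in this sketch only in its trigger: it would look for stale replicas in the background instead of waiting for a read.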
Monitoring is great for cleaning up after aborted operations.
Pattern-matching is an elegant way to parse binary data.
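For comparison, here is the same kind of parse done by hand in Python. The frame layout (1-byte version, 2-byte big-endian length, payload) is a made-up example, not a Riak wire format. In Erlang the whole function collapses into a single pattern along the lines of <<Version:8, Length:16, Payload:Length/binary, Rest/binary>>.

```python
import struct

# Hypothetical frame header: 1-byte version, 2-byte big-endian payload length.
HEADER = struct.Struct(">BH")


def parse_frame(data: bytes):
    """Split data into (version, payload, rest); raise ValueError if truncated."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    version, length = HEADER.unpack_from(data, 0)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    rest = data[HEADER.size + length:]
    return version, payload, rest


frame = bytes([1]) + (5).to_bytes(2, "big") + b"hello" + b"tail"
version, payload, rest = parse_frame(frame)
```

The explicit length checks and slicing are exactly the bookkeeping that the Erlang binary pattern does implicitly.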
gen_fsm - one of the OTP library behaviors (finite state machine)
In Erlang, everybody uses these behaviors; makes for more readable code.
Let It Crash - Joe Armstrong’s doctoral thesis (he created Erlang)
Business logic goes in Workers; Supervisors are very simple and just start and watch Workers. Little can go wrong with Supervisors.
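The worker/supervisor split can be illustrated with a single-threaded analogy. This is only a sketch under loud assumptions: real OTP supervisors are separate processes with restart strategies, and supervise, flaky_worker, and the max_restarts limit are hypothetical names chosen for the example.

```python
def supervise(worker, max_restarts=3):
    """Run worker(attempt); on a crash, record it and start the worker again.

    Returns (result, crashes). Gives up with RuntimeError after max_restarts restarts.
    The supervisor holds no business logic: it only starts and watches.
    """
    crashes = []
    for attempt in range(max_restarts + 1):
        try:
            return worker(attempt), crashes
        except Exception as error:
            # Let it crash: no recovery logic here, just note it and restart.
            crashes.append(type(error).__name__)
    raise RuntimeError(f"worker crashed {len(crashes)} times; giving up")


def flaky_worker(attempt):
    """Business logic lives in the worker; this one fails on its first two runs."""
    if attempt < 2:
        raise ValueError("bad input")
    return "done"


result, crashes = supervise(flaky_worker)
```

The worker contains no defensive error handling at all; a fresh start from a known-good state is the recovery strategy.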
Erlang/OTP System Facilities
Get status of OTP process
Trace function calls, messages
Put a trace on a process - VERY powerful
Seldom need a debugger
Be careful - you can take a system down by improper tracing
Linking with C/C++
NIF - Native Implemented Function
Can ref count binaries
Portable interface to OS
Can crash the whole VM with a bad NIF
NIF calls execute within a VM scheduler thread - can block the thread
NIFs should only block for a millisecond or less
Put long-running activities in their own threads.
EUnit - unit testing
QuickCheck - you create a model of the system under test (SUT), and it runs randomly generated tests against it (like Pex?) - shrinks a failing test case for easier debugging - awesome
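The generate-then-shrink idea can be sketched in a few lines. This is not QuickCheck's API: check, shrink, and the deliberately buggy function under test are all made up for illustration, and the shrinking strategy (greedily dropping one element at a time) is a simple stand-in.

```python
import random


def buggy_sort(items):
    """Deliberately wrong 'sort' used as the system under test: it drops duplicates."""
    return sorted(set(items))


def prop_sort_keeps_length(items):
    """Property: sorting must not change the number of elements."""
    return len(buggy_sort(items)) == len(items)


def shrink(items, prop):
    """Greedily remove single elements for as long as the property still fails."""
    smallest = list(items)
    changed = True
    while changed:
        changed = False
        for i in range(len(smallest)):
            candidate = smallest[:i] + smallest[i + 1:]
            if not prop(candidate):
                smallest = candidate
                changed = True
                break
    return smallest


def check(prop, runs=200, seed=0):
    """Try random small integer lists; return a shrunk counterexample or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        items = [rng.randint(0, 5) for _ in range(rng.randint(0, 10))]
        if not prop(items):
            return shrink(items, prop)
    return None


counterexample = check(prop_sort_keeps_length)
```

Whatever random list first exposes the bug, shrinking reduces it to two equal elements, which points straight at the duplicate handling.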
Watch your memory.
Hot code loading really works.
Understand the Erlang VM.
A DSL for distributed systems.
A Little Riak Book, Riak Handbook
Elixir - Ruby-like language on Erlang VM - also some Lisp-style languages
The O’Reilly book Erlang Programming (Francesco Cesarini, Simon Thompson) is also very good.
(And he gave me a free book for asking a question! - Bill S.)