Here we will cover the end-to-end design, technology choices and high-level architecture for building a food delivery app (one that can be extended into an e-commerce app). The major functional requirements that we will be covering are:
1. Restaurant onboarding.
2. Food and Menu (Uploaded by Restaurant, Approved by App Admin)
3. Dynamic Pricing (Add delivery cost, discounts, promos, etc.)
4. Door-to-Door Delivery (Driver selection, continuous location updates for the order, etc.)
5. User Registration, Roles and Security
6. Search and Recommendations based on user activity, past orders, user locations and ongoing promos.
7. Notification (Notify customer for every event).
8. Integration with Payment Gateway.
9. Reviews, Ratings and Surveys.
10. Order handling, cart and Checkout.
11. Customer Support and improvements.
12. Design that can be enhanced to support B2C E-Com Application.
13. Sales Report
14. Billing and Documents
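As a rough illustration of requirement 3 (dynamic pricing), the sketch below shows one way to combine item prices with a percentage discount, a flat promo and a delivery fee. The function name `price_order`, its parameters and the order in which the adjustments are applied are our assumptions for illustration, not a fixed design.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP
from typing import Iterable


@dataclass(frozen=True)
class PriceBreakdown:
    subtotal: Decimal
    discount: Decimal
    delivery_fee: Decimal
    total: Decimal


def _money(value: Decimal) -> Decimal:
    # Round to two decimal places, as most currencies expect.
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def price_order(
    item_prices: Iterable[Decimal],
    discount_pct: Decimal = Decimal("0"),   # hypothetical: percentage off the subtotal
    promo_off: Decimal = Decimal("0"),      # hypothetical: flat promo amount
    delivery_fee: Decimal = Decimal("0"),
) -> PriceBreakdown:
    subtotal = sum(item_prices, Decimal("0"))
    discount = subtotal * discount_pct / Decimal("100") + promo_off
    # Assumption: discounts can never exceed the subtotal.
    discount = min(discount, subtotal)
    total = subtotal - discount + delivery_fee
    return PriceBreakdown(
        _money(subtotal), _money(discount), _money(delivery_fee), _money(total)
    )


breakdown = price_order(
    [Decimal("12.50"), Decimal("7.50")],
    discount_pct=Decimal("10"),
    promo_off=Decimal("3"),
    delivery_fee=Decimal("2.99"),
)
print(breakdown.total)  # 17.99
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding errors.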
The non-functional requirements are:
1. The system should support more than a million orders and 10x as many search requests.
2. Search, including recommendations, should respond in under 500 ms.
3. No data loss for transactional data.
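To get a feel for the volume target above (more than a million orders and ten times as many searches), here is a back-of-the-envelope calculation. The requirement does not state a time window, so the sketch assumes the volumes are per day; the peak factor of 5 is also a hypothetical number chosen only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(events_per_day: int) -> float:
    """Average events per second if load were spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


def peak_rate_per_second(events_per_day: int, peak_factor: float) -> float:
    """Average rate scaled by an assumed peak-to-average ratio."""
    return average_rate_per_second(events_per_day) * peak_factor


orders_per_day = 1_000_000              # assumption: "a million orders" means per day
searches_per_day = 10 * orders_per_day  # 10x search requests, from the requirement
peak_factor = 5                         # hypothetical meal-time peak multiplier

print(f"orders/s   avg={average_rate_per_second(orders_per_day):.1f} "
      f"peak={peak_rate_per_second(orders_per_day, peak_factor):.1f}")
print(f"searches/s avg={average_rate_per_second(searches_per_day):.1f} "
      f"peak={peak_rate_per_second(searches_per_day, peak_factor):.1f}")
```

Under these assumptions the averages come to roughly 12 orders and 116 searches per second, so the search path is the one that has to be sized for the 500 ms target.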
The technology choices are:
1. Cloud platform : Hybrid cloud. AWS and native cloud (Kubernetes), with Direct Connect between AWS and the on-prem data centre. The AWS services used are SQS, SNS, AWS Lambda, AWS ADM, VPC, etc., to support the notification system. For a detailed overview, please refer to the notification app.
2. Kafka Streams : Used to achieve high throughput when streaming events between the read and write repositories of order management, sending real-time or batch events to the recommendation engine, and publishing approved catalogs and products (menus and foods) to the Elasticsearch index.
3. RabbitMQ : To coordinate transactions between distributed microservices and avoid data/transaction loss.
4. Apache Spark/Beam : For running map-reduce jobs to select a driver for efficient delivery and to generate recommendations based on continuous streams of user activity, order activity, pricing and promo changes, etc.
5. ClickHouse DB : All processed data will be stored in an intermediate DB that performs well under high read and write throughput. Cassandra could have been a choice, but its lower read performance when a read has to touch multiple SSTables on huge datasets made it an inefficient candidate for this purpose. Aerospike could have been the second choice, but because this is processed data that we want to store in relational form and query, we settled on ClickHouse.
6. Aerospike DB : To support high read and write throughput in real time.
7. Redis Cache : To store the location and segment information of all the restaurants in the city. As this data is read very frequently and is effectively static, we chose to keep it in Redis to avoid repeated DB calls. The items selected in a cart will also be stored in Redis.
8. Elasticsearch : Elasticsearch indexes are used to store all published catalogs and products, along with recommended items and placed-order queries, to achieve the targeted SLA for search and browse requests. Elasticsearch will also be used to index customer call records and the metadata for their transcriptions.
9. PostgreSQL : To store all transactional data. We have chosen PostgreSQL to support the native cloud approach, but other options such as managed storage on a public cloud can also be considered to achieve high scalability.
10. GraphQL : To achieve the targeted SLA, we have designed the system with separate read and write repositories. The write repository is aligned with the domain data, and the read repository with the data requested by the UI. To do so, we bundle all product- and order-related data into their respective documents inside the Elasticsearch indexes. To reduce load and send only relevant data to the UI, we use GraphQL to query selected fields from the document tree instead of the complete document. The same goes for the recommendation engine.
11. WebSockets : Will be used to deliver continuous updates of the location of the driver who picked up the delivery to the customer.
12. REST API(s) : Other than the cases specified above for WebSockets and GraphQL, all other requests will go through REST API(s).
13. MongoDB : To store review, rating, notification and survey data that doesn't have a fixed schema and changes over time.
14. Containers : Dockerized Spring Boot/WebFlux based services.
15. Cloud Storage : For blob and file storage, such as transcriptions, documents, reports, templates, etc.
16. NodeJS : Lightweight API(s) exposed to perform quick and efficient IO operations and support scalable traffic with controlled memory. Java Reactive/Spring WebFlux can also be considered as an alternative.
17. Map API : The Map API will provide the city information and the segment in which a restaurant is located. It can also be used to determine the possible routes between the restaurant the food is ordered from and the customer's location.
18. Config Server : For Externalization of properties/configurations.
19. API Gateway : Exposes the proxy API(s) of external systems. Three different API gateways will be exposed, with different levels of security and RBAC checks.
20. Kubernetes (Hosted on Public Cloud or On-Prem) : The Kubernetes cluster will be deployed on-prem or on a public cloud, and all the containerized services will be hosted on the cluster. For infrastructure, Terraform is preferred so that infra services and components stay portable between different clouds. Containers are Docker based.
21. Service Mesh : For inter-service communication, certificate management, service discovery, version management and rate limiting (if required).
22. Load Balancer (Network and Service) : The network load balancer will use URL-based routing to direct a request to a service, and the service load balancer will distribute that traffic across the deployed service instances according to the configured mode (round robin or load based).
23. Network Firewall : External firewall placed in front of the API gateway to secure the internal infrastructure.
24. Active Directory : Will store all registered user credentials and RBAC data to manage authentication and authorization.
25. DevOps Tools : Terraform (infrastructure management), Vault (secret management), Jenkins (CI/CD), Docker, Kubernetes, AWS, Datadog and Prometheus (monitoring and metrics), ELK/Splunk (logging).
26. Amazon Athena/Hive: To query the logs and transcriptions stored in the blob/cloud storage.
27. CDN : Akamai/Amazon CDN for static content and caching.
28. Rule Engine : For configuring pricing and transcription rules.
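To illustrate the idea behind item 10 (sending only the fields the UI asks for from a bundled read document), here is a minimal sketch in plain Python. It is not GraphQL itself and uses no GraphQL library; the `project` helper, the selection format and the sample order document are hypothetical stand-ins for what a GraphQL resolver would do.

```python
from typing import Any, Dict

# Hypothetical selection format: field name -> True (take the whole value)
# or a nested Selection (take only some sub-fields).
Selection = Dict[str, Any]


def project(document: Dict[str, Any], selection: Selection) -> Dict[str, Any]:
    """Return a copy of `document` containing only the selected fields."""
    result: Dict[str, Any] = {}
    for field, sub in selection.items():
        if field not in document:
            continue  # this sketch silently skips unknown fields
        value = document[field]
        if sub is True:
            result[field] = value
        elif isinstance(value, dict):
            result[field] = project(value, sub)
        elif isinstance(value, list):
            result[field] = [
                project(v, sub) if isinstance(v, dict) else v for v in value
            ]
        else:
            result[field] = value
    return result


# Sample denormalized order document, as it might sit in the read repository.
order_document = {
    "orderId": "o-1",
    "status": "PICKED_UP",
    "restaurant": {"name": "Sample Diner", "address": "1 Main St", "segment": "north"},
    "items": [
        {"name": "Burger", "qty": 2, "price": "8.50", "kitchenNotes": "no onions"},
        {"name": "Fries", "qty": 1, "price": "3.00", "kitchenNotes": ""},
    ],
    "payment": {"method": "card", "gatewayRef": "hypothetical-ref"},
}

# An order-tracking screen only needs a small slice of the document.
tracking_view = project(
    order_document,
    {
        "orderId": True,
        "status": True,
        "restaurant": {"name": True},
        "items": {"name": True, "qty": True},
    },
)
print(tracking_view)
```

The point of the design is that the read document can stay large and denormalized for fast lookups, while each screen still receives only the fields it selected.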