A More Modern Web Architecture (Wed, 22 Apr 2015)

Over the past year I’ve been revamping the project mentioned in my last post. Today, the architecture has been split into a tiered approach: a backend API, with the front end and other backend services layered on top. The stack has shifted from Java + Python/Django to an almost completely Node.js-based framework. I have also switched from Apache to nginx, mainly because I find it much more straightforward to configure for multiple subdomains and applications. MySQL is still the primary data store, but I have added Redis for tasks that don’t need to be persistent but do need to happen fast. I’m sure my usage of Redis will grow as I find more use cases for it in this architecture. At this point the main system consists of three distinct applications: the API, the Web app, and the Crawler.

The API and Web applications are built using Hapi.js from Walmart Labs. It’s a great framework that gives you a lot of freedom in how you do things while bootstrapping the core functionality of a web app. The main flow of information goes from the client’s browser through the Web app and then to the backend API. This gives me the opportunity to scale each piece individually as the need arises. I have also switched to using Amazon S3 for storing the listing images, of which there are currently around 1.5 million. nginx caches these specific routes, which are masked as a cdn subdomain off the main application domain. It’s pretty cool because this lets me store a virtually unlimited number of images in S3, while the most popular listings are served from the nginx cache on my own server after the first load. This saves on cost, since I rarely have to contact S3 to retrieve images.
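The cdn subdomain setup can be sketched as an nginx proxy cache sitting in front of S3. This is a minimal illustration, not the production config: the bucket name, server name, cache zone size, and expiry times below are all assumptions.

```nginx
# Illustrative sketch: cache S3-backed listing images behind a cdn
# subdomain. Bucket name, zone size, and cache times are assumptions.
proxy_cache_path /var/cache/nginx/images levels=1:2 keys_zone=images:10m
                 max_size=10g inactive=30d use_temp_path=off;

server {
    listen 80;
    server_name cdn.example.com;  # stands in for the real cdn subdomain

    location / {
        proxy_pass https://example-listings.s3.amazonaws.com;
        proxy_cache images;
        proxy_cache_valid 200 30d;        # keep hits local for a month
        proxy_ignore_headers Set-Cookie;  # make S3 responses cacheable
        add_header X-Cache-Status $upstream_cache_status;
    }
}
```

After the first request for an image, subsequent hits are answered from the local cache (`X-Cache-Status: HIT`), which is why S3 is only contacted rarely.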

The crawler is a backend application also built with Node.js. I feel that Node.js shines in this area, since the bulk of the crawler’s time is spent waiting on network responses from external real estate sources. The non-blocking nature of Node.js, combined with its single-threaded execution, has yielded much lower memory usage than its old Java counterpart. It also feels quicker at completing the same tasks it was doing before. The crawler finds relevant listing information and uploads it to the API, which the Web app uses to display results to end users.

Web Apps, Scalability, and French Real Estate (Tue, 11 Mar 2014)

Using a virtual machine with a low memory budget to run a scalable web app is turning out to be a fun challenge. To give a bit of context, I am two months from graduating from Appalachian State University. I’m currently in a class where we are supposed to show off everything we learned in one cumulative project. I’ve chosen to build a web app for finding real estate in France.

The project has two sides: a Django-powered web platform running on Apache that displays results to users, and a backend Java web crawler. The Django side has been fairly good at keeping up with the demands of the growing dataset. The Java side, however, has been quite touchy. I’ve been running my jar file under these conditions.
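A low-heap invocation along these lines keeps the JVM’s footprint bounded on a machine with a 1024MB guarantee; the specific values here are illustrative assumptions, not the exact flags in use.

```shell
# Illustrative flags only: cap the heap well below the 1024MB RAM
# guarantee so the JVM's own overhead and the OS still have headroom.
java -Xms128m -Xmx512m -jar crawler.jar
```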

While I’m not sure that’s the most effective combination, it does keep my server under the guaranteed RAM limit of 1024MB. The tradeoff of running such a low-memory JVM has been a steady stream of out-of-memory errors.

I found the JDBC driver for MySQL to be one of the biggest memory hogs. The real issue, though, was me not closing resources when I was finished with them. It turns out every Statement and ResultSet must be closed when you’re done, or they will sit around far longer than you want.
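Since Java 7, try-with-resources is the cleanest fix: JDBC Connection, Statement, and ResultSet all implement AutoCloseable, so they are closed automatically, in reverse order, even when an exception is thrown. A minimal runnable sketch of the mechanism, using a stand-in resource in place of a real database handle:

```java
import java.util.ArrayList;
import java.util.List;

public class CloseDemo {
    static final List<String> log = new ArrayList<>();

    // Stand-in for a JDBC Statement or ResultSet: anything implementing
    // AutoCloseable is closed automatically by try-with-resources.
    static class TrackedResource implements AutoCloseable {
        final String name;
        TrackedResource(String name) {
            this.name = name;
            log.add("open " + name);
        }
        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    public static void main(String[] args) {
        try (TrackedResource stmt = new TrackedResource("statement");
             TrackedResource rs = new TrackedResource("resultSet")) {
            log.add("use " + rs.name);
        } // both closed here, in reverse order, even on an exception
        System.out.println(String.join(", ", log));
    }
}
```

With real JDBC objects the shape is the same: open the Statement and ResultSet in the try header instead of calling close() by hand, and nothing leaks when a query throws.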

Since discovering that, most of the memory issues have been alleviated. Now my limited 60GB HDD is filling up with images. I’ll have to find a good solution for that within three weeks, or the disk will be completely full.