August 31, 2013

Insights and tips for working with NoSQL Databases


The problem with NoSQL DBs is that most developers don't really know how to use them in business-grade solutions, so they tend to keep all the "bad habits" and approaches of SQL. A big problem they face is when they want to relate data sets and they look for a "direct dependence" like in SQL. I see the relational approach as one of SQL's biggest flaws, not the other way around. Why?

Well, first, it makes development focus more on the data model (due to its complexity) than on the user experience and the UI's needs; this is why systems tend to be complicated to use and provide very little flexibility for the user. Also, relational data doesn't always handle parallel data manipulation well (due to the relations between tables). Software architecture must follow hardware architecture for best performance; hardware is focusing on parallel processing, and so is modern software, and that requires a parallel data model.

Another problem with the relational approach is that since so many things depend on others, it's very easy for something to go wrong, and this usually ends with an error message, or at least with not being able to perform an operation.

Developers really need to learn how to use NoSQL systems. JSON is a great format for data administration; if you know how to use it properly, it can do things that developers and users alike consider "utopias".

A simplified explanation of these "new" data-model concepts:


Unlike the heavily "normalized" data in SQL, here every object holds all the properties and sub-properties related to it (de-normalization / high data replication); the system only groups objects that share some specific property or properties when performing a query. This gives the system great flexibility and stability: even if there is data corruption, only a fraction of the data gets damaged, and it's usually easy to "clean" it and get back to normal, if this ever happens.
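For example, a de-normalized object carries everything it needs with it, so no joins are required (a minimal sketch; the fields are hypothetical):

```javascript
// A de-normalized object: every property and sub-property travels with the
// object itself, replicated wherever it is needed (hypothetical fields).
var fridge = {
  name: "Refrigerator",
  serial: "FR-2013-0042",
  price: 499.99,
  supplier: {            // sub-properties live inside the object,
    name: "CoolCo",      // not in a separate related table
    phone: "555-0100"
  }
};
```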

When you talk about de-normalizing data, everyone thinks the table will become HUGE, expensive and slow, but JSON is actually very space-efficient: it's just text, commas, braces and brackets! Data gets formatted very efficiently compared to SQL, and you will be amazed at how much data you can fit in 100 MB. Also, smart systems allow data to grow without impacting performance significantly. Another thing that keeps JSON small is that it does not waste space declaring dependencies and types. It is also highly compressible (due to the high data replication) and therefore works great with data compression (Apps Script has special zip methods for this, and most servers support it or can be configured to); use this when you need to favor bandwidth over processing time.
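For instance, here is a minimal sketch of zipping a JSON payload with Apps Script before storing it in Drive (the file names are hypothetical):

```javascript
// Compress a JSON string with Apps Script's zip utility and save it to
// Drive; highly replicated data compresses very well.
function saveCompressedJson(jsonString) {
  var blob = Utilities.newBlob(jsonString, 'application/json', 'data.json');
  var zipped = Utilities.zip([blob], 'data.zip');
  DriveApp.createFile(zipped);
}
```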

Objects do not need to declare the type or size of the data they hold; JSON doesn't care about or need this information, and again, this provides outstanding flexibility.

Let's use a warehouse as an example of data stored in the DB. SQL couldn't be worse for this; still, she's the queen of these realms. A shame she's such an old queen, unwilling to catch up with the times. Why? Well, this lady forces everyone to follow her rules and expectations, and refuses to accept things she doesn't know and understand well in advance. Also, she expects uniformity and is a big fan of bureaucracy. Her reign has been long, and by now everyone is used to her ways; a different reality seems too utopian to really work, and most of those who have tried a taste of the NoSQL kingdom, after a life of following the Queen's demands, start feeling lost and keep trying to emulate the old ways in the new realms. Of course things go bad.

The real-life warehouse holds objects of all kinds, very different from each other, and they change over time; new things come with new characteristics and may no longer use some of the old ones. Even though they can be very different, they are all stored together, and if peace is to come to the warehouse, everyone must be treated as equal. So the little screw that is sold by weight (instead of by piece) and the refrigerator with a serial number can be stored next to each other, opening the path to a smart, universal search where you type any property or name, get the matching result or results, and then decide what to do with what you found; one also expects to be able to do anything (reasonable) with it. Note that this also makes the universal search the universal window (to some extent).
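In JSON, those two very different items can sit side by side in the same collection, each declaring only the properties that make sense for it (a sketch with hypothetical fields):

```javascript
// Heterogeneous objects living together in one collection; a universal
// search simply matches any property or name across all of them.
var warehouse = [
  { name: "Screw M3",     soldBy: "weight", kilograms: 12.5 },
  { name: "Refrigerator", serial: "FR-2013-0042", pieces: 3 }
];
```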


Every object knows all its details, you know, like in real life... so it doesn't need to have its information spread across different tables. Inhabitants of the SQL kingdom are always trying to keep a different table for everything and group everyone according to some arbitrary attribute or relationship, but in the NoSQL world this is evil... possible, but evil. And when certain information changes and you need to change other object properties in response, fire warnings or simply keep a record, most developers have problems working over NoSQL DBs: they want special tables and fixed relations to handle this, but here you don't bend reality to fit your model; it's the other way around. So if the amount of screws changes, or if the fridge gets sold, separate "robots" (usually JSONP servlets) are notified about it AFTER the information in the table changes; this way, the user doesn't have to wait extra time for all the bureaucracy to be done. These robots update the values of a separate object (in a different table, the same table or even both), write a record to a list (a special Drive file, for example), notify an external server, or perform any other kind of action. Google Apps Script is great for this, and these GAS robots also bridge your app, the data model, the client and the DBs when needed.
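Here is a minimal sketch of one such robot, assuming a simple query-string notification and a Drive text file as the record list (the parameter and file names are hypothetical):

```javascript
// A GAS "robot": notified after an item changes, it appends a record to a
// log file in Drive and answers with JSONP so browsers can call it directly.
function doGet(e) {
  var record = new Date() + ' | ' + e.parameter.item + ' -> ' + e.parameter.qty + '\n';
  var files = DriveApp.getFilesByName('warehouse-log.txt');
  var file = files.hasNext() ? files.next()
                             : DriveApp.createFile('warehouse-log.txt', '');
  file.setContent(file.getBlob().getDataAsString() + record);
  var callback = e.parameter.callback || 'callback'; // JSONP padding
  return ContentService.createTextOutput(callback + '(' + JSON.stringify({ok: true}) + ')')
      .setMimeType(ContentService.MimeType.JAVASCRIPT);
}
```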

Also, these robots are specific and work in parallel; each one knows who is authorized to use it and who isn't, and can keep track of who called it, for what and when, without affecting its performance. If one fails, it's easy to spot it and return to a previous version while a fix is made; in the meantime, everything else will usually keep running, and the system will just be missing that part of the processing. Smart programming can bring everything back up to date once the fix is done, all with no downtime or only partial downtime. Once the fix is ready, you simply update the public version of the GAS robot, and that's it. This kind of thing just doesn't happen so easily in the SQL kingdom.

Also, this way it makes more sense to plan your eventual data consistency, and here the key is to prioritize: to know where the data must be up to the second and where it can be a little outdated. Don't get me wrong, all data can be live; it's possible to have 10,000 users working live over 10 TB of data (have you seen Firebase yet?), but it will be expensive as hell... In this realm it's not about whether something is possible (there are many ways to do almost anything); the tricky part is to balance everything, in this case live data vs. cost. My best tip: keep the reports and notifications in a non-live DB.


One way to keep your live JSON data clean is to make programmatic backups and, after each backup, clean the live-data table of all objects that represent empty things (tip: this is when you should erase something, when its amount reaches 0 or hits a null value, not before, for historical-report convenience and to avoid errors in eventual-consistency handling).

These backups are best done by having a robot (a GAS servlet) query all the JSON in the DB, stringify it and save it in a special Drive file, named with the date, inside a specific Google Drive folder; another way (best for small/medium applications) is to dump the data into a Google Docs Spreadsheet (exclusively or additionally). The appearance of this new file should trigger another robot that searches for the objects that need to be cleaned and erases them from the live data. No downtime, and no risk of losing data created during the backup/clean process.
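A minimal sketch of that backup-then-clean pair, assuming the live data can be read and written as one JSON string (getLiveData, writeLiveData and the folder name are hypothetical):

```javascript
// Back up the live data to Drive, then clean out objects that represent
// empty things (amount 0 or null), exactly in that order.
function backupAndClean() {
  var raw = getLiveData(); // hypothetical helper for whatever DB you use
  var folder = DriveApp.getFoldersByName('backups').next(); // assumes it exists
  var name = Utilities.formatDate(new Date(), 'GMT', 'yyyy-MM-dd') + '.json';
  folder.createFile(name, raw);
  var live = JSON.parse(raw).filter(function (obj) {
    return obj.amount !== 0 && obj.amount !== null;
  });
  writeLiveData(JSON.stringify(live)); // hypothetical helper
}
```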

Tips for professionals: if you need fast, direct communication between your robots, with the possibility of "holding" specific data writes until someone has finished, Apps Script ScriptDB is your solution. If you need an extra punch of GAS robot performance, sometimes the Apps Script Cache Service holds the key. To import data, prefer REST and use the Apps Script URL Fetch Service. To export data and communicate with traditional servers, use the Apps Script JDBC Service and/or the Apps Script XML Service. To export data to your HTML5 web app, use the Apps Script HTML Service or the Apps Script Content Service. If all you care about is price and scalability, go for the Google Cloud Datastore JSON API; for live (but much more expensive) data, go for Firebase or the Google Drive Realtime API (there is a great GitHub example of how to use the RT API in AngularJS, created by the awesome programmer +Steven Bazyl), and to handle the async data loading, AngularJS works great. Mix and balance things for best results.

But where this method really shines is when it's time to do historical analysis of the data, as long as it's done with the right tools, of course. The very best way is BigQuery, which is capable of importing and processing terabytes of data in seconds; GAS can then pass the results to Google Charts to make amazing displays of data, and you can easily compare how specific things behave over time, even across years of data. If you are using Spreadsheets to store your data, you can save a lot of trouble (and money) by using Fusion Tables instead to merge and analyze your data.


Well, first and foremost, very few people know how to use these technologies properly; this is not what you find in common books, learn at school or can simply ask your old teachers about. It's new stuff! Also, its implementations, being based on open standards, work with so many things and in so many ways that balancing cost, performance, compatibility and flexibility depends a lot on the development team's skill and imagination. Most developers are used to being told how to do things, following standard procedures and guidelines; all of that applies here too, but the material is yet to be written. This opens the path for those ready to push the limits and show what they can do; now more than ever, developers can show their skills and stand above even the rulers of the SQL kingdom.

The trend right now is to build convenient systems that are accessible from any platform. Users expect the system to adapt to their needs and help them do things; everyone hates having to learn and adapt to the system's needs, being forced to help it do what it was made for...

For all this (and so much more), a flexible web-based model like HTML5, with a flexible UI like CSS3, cross-platform code like Javascript, a CORS-friendly data model like JSON and a NoSQL back-end makes sense for the future of applications. And remember that your workflow must be at least as flexible and responsive as your data and UI.

Thanks for reading and happy coding!

August 27, 2013

How to get all your Google Drive folders and URLs with a GAS JSONP servlet




Using all the folders of a Google Drive account as a collaboration point for an external web app sounds pretty convenient in certain scenarios, especially for web-based centralized business management.

Apps Script makes this easy with its Drive Services: with very little effort you can make a GAS JSONP servlet that queries and packs all the folder names and URLs into a single JSON string. This is especially useful if you are trying to merge different accounts into a single interface, and you can get pretty creative about how to implement it.

To make your own GAS JSONP servlet to handle your Google Drive folder names & URLs, create a new Google Apps Script project and start from something like the following minimal sketch (the prefix callback parameter is the same one the test HTML below assumes):
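```javascript
// Walk every folder in the account and answer with a JSONP response
// containing each folder's name and URL.
function doGet(e) {
  var folders = DriveApp.getFolders();
  var result = [];
  while (folders.hasNext()) {
    var folder = folders.next();
    result.push({name: folder.getName(), url: folder.getUrl()});
  }
  var prefix = e.parameter.prefix || 'callback'; // JSONP padding
  return ContentService.createTextOutput(prefix + '(' + JSON.stringify(result) + ')')
      .setMimeType(ContentService.MimeType.JAVASCRIPT);
}
```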


Depending on what you need, you might add more data about the folders; these are the most relevant methods:

1.- getAccess(email): Checks the user's rights on this folder.

2.- getDateCreated(): Returns the date when the folder was created.

3.- getDescription(): Gets the folder's description.

4.- getId(): This one is particularly useful when working with JSONP; you can use the padding to tell the servlet to do something to a specific folder via the getFolderById(id) method.

5.- getFiles(): It's possible to include all the files inside each folder in our JSON package; just add another while loop inside the folders' while loop, and you get all the folder names and URLs along with all their file names and URLs (see the sketch after this list). Always use parallel loading when doing this; in most cases the servlet will take some time to collect all this data.

6.- getSize(): Gets the number of bytes used to store this folder in Drive, in case you need to calculate loads.
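Here is a minimal sketch of the nested loop from item 5, with the same servlet shape as before:

```javascript
// For every folder, also collect the names and URLs of the files inside it.
function doGet(e) {
  var folders = DriveApp.getFolders();
  var result = [];
  while (folders.hasNext()) {
    var folder = folders.next();
    var files = folder.getFiles();
    var fileList = [];
    while (files.hasNext()) {
      var file = files.next();
      fileList.push({name: file.getName(), url: file.getUrl()});
    }
    result.push({name: folder.getName(), url: folder.getUrl(), files: fileList});
  }
  var prefix = e.parameter.prefix || 'callback';
  return ContentService.createTextOutput(prefix + '(' + JSON.stringify(result) + ')')
      .setMimeType(ContentService.MimeType.JAVASCRIPT);
}
```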

Here is a very basic way to call the servlet from your HTML; for testing purposes the prefix is set to "alert", so the data will be displayed in a pop-up as soon as it loads. For real-life applications, change the "alert" prefix to the name of the Javascript function responsible for parsing the JSON. I also recommend using AngularJS to handle parallel data loading when calling different servlets.
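A minimal sketch of that caller (swap alert for your own parser function in production):

```html
<!DOCTYPE html>
<html>
  <body>
    <!-- JSONP: the servlet wraps the JSON in the function named by "prefix",
         here "alert", so the raw data pops up as soon as it loads -->
    <script src="https://script.google.com/macros/s/AKfycbyixGlm9VJb9RuK_1ZYbsLrduuee4RX2v27mzKvtiFiTm5WaKRK/exec?prefix=alert"></script>
  </body>
</html>
```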


The HTML already includes the URL of a fully working servlet; feel free to use it for testing. Before you can use it, though, you need to grant access to the servlet; this is a one-time requirement, done by manually opening the servlet's URL:

https://script.google.com/macros/s/AKfycbyixGlm9VJb9RuK_1ZYbsLrduuee4RX2v27mzKvtiFiTm5WaKRK/exec

Stay tuned for articles with more advanced uses of this approach, and feel free to post your ideas, comments or suggestions.

Happy coding!

August 23, 2013

How to back up your Firebase DB on Google Drive using Apps Script!




I've recently started using Firebase, and I've got to say that it's pretty awesome! Especially if you use AngularJS as your front-end, you can start making live apps in minutes, and thanks to Google Drive HTML hosting, you don't even need to waste time looking for a place to host everything.

I really recommend trying the example "Wire up a Backend" found on the home page of the AngularJS project. You won't believe at first that "that was it"; it's so fast and easy that I'm sure you will think there is something else to do before it starts actually working.

Firebase can do a great job as a live-data back-end, but it's not very appropriate for historical records, for a few reasons:

* Storing static backups takes too much space, and that space should be used wisely (the free 100 MB quota can run out quickly if you are not careful).

* Downloading all the data every time you want to make a report consumes a lot of bandwidth quota, especially when you fetch many backups to make a time graph with Google Charts.

* It's best to save those precious connections, since those come with a price too if you exceed the free 50-concurrent-connections quota.

So, instead, it's best to save a backup of your precious Firebase data in a safe place. Firebase already has a button that lets you save a backup from the interface anytime you want... and that's good! But let's make it great by storing it automatically in a specific folder, using the date as the file name. Google Drive's free quota is more than large enough for millions of backups in most applications, there are no bandwidth limits, it can handle up to 10 requests per second per user, and you get 10,000,000 free requests a day... So feel free to let users query and make those cool graphs all they want from Drive. This way you use only 1 connection, for a very short time, and you only need 1 download (usually once a day) to get a copy of your data.

So, how do you connect the two services? Well, using Firebase with Google Apps Script is a piece of cake thanks to Firebase's REST API, the Apps Script URL Fetch Service and the Apps Script DocsList Service. Let's see how simple it is:
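A minimal sketch (the Firebase URL and folder name are placeholders):

```javascript
// Fetch the whole Firebase DB as JSON through its REST API and store it
// in Drive with the date as the file name; run it from a daily trigger.
function backupFirebase() {
  var json = UrlFetchApp.fetch('https://YOUR-DB.firebaseio.com/.json').getContentText();
  var name = Utilities.formatDate(new Date(), 'GMT', 'yyyy-MM-dd') + '.json';
  DocsList.getFolder('Firebase Backups').createFile(name, json);
}
```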


Since Firebase's REST API allows us to get a copy of the DB with a simple URL, we can use it to fetch all the information directly via URL Fetch. In this example we simply store the text inside a specific Google Drive folder, but it's also possible to parse the data and pass it to ScriptDB, Spreadsheets, Charts, Gmail, etc.

So give it a try; you don't even need to build the AngularJS example first. You can register with Firebase and manually create a DB and some dummy data to work with; Firebase's web interface is extremely easy to use, so you won't have any problems with this.

Promising, huh? Well, if you want to do some serious data processing, you can use BigQuery or Fusion Tables to put all that data together and watch how it evolves through time. Also, if anything ever goes wrong, simply upload your most recent backup from Drive; Firebase makes this very easy with its "Import JSON" option.

Looking for a way to connect your brand-new, super-cool Firebase live DB to your client's "traditional" web server? You can use the Apps Script XML Service or the Apps Script JDBC Service, or pass the data to ScriptDB to keep a frozen copy that responds to GET or POST requests via the Apps Script Content Service; this last option is very useful for making charts on web pages using the Google Charts Javascript API and some JSONP.

Just keep in mind, when you are working with Firebase, to be very careful with your quotas (they are meant to last a month), and use whatever techniques you find appropriate to reduce data consumption.

Feel free to let us know your ideas and opinions about these platforms at the bottom of this article, and stay tuned for more articles about this great technology combo.

Happy coding!

August 7, 2013

Ways to share anything with anyone using Gdrive public HTML hosting!


Looking to share your files with the world, but just sharing Gdocs isn't good enough for you? Don't worry! Google Drive has a special feature called public HTML hosting, which allows you to turn any Gdrive folder into a virtual "web server"! When others access the page, they see a list of all the folder's public content (by default, everything); if they click on a file and it's not something Gdrive can open directly (like images or docs), the user's browser will simply download the file.

Let's see an example: this Gdrive folder contains a .rar file, and since Gdrive cannot open this kind of file directly, when you click on it, your browser will start downloading it:

https://googledrive.com/host/0B_RClkFMLkcpdXlOZWdwM2JuWUk/

You can even put it inside an iframe to show its contents directly on a website!
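For example, using the folder above:

```html
<!-- Embed the hosted folder directly in a page -->
<iframe src="https://googledrive.com/host/0B_RClkFMLkcpdXlOZWdwM2JuWUk/"
        width="600" height="400"></iframe>
```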



Sweet, huh? The best part is that the content stays in sync with your Gdrive, so if you change your content, the public version will be instantly updated! You can even host webpages! Just throw an index.html file into that folder, and when users open the folder's URL, they will receive the content of index.html instead of the default folder view.

But that's not all you can do... If, instead of using the folder's URL, you point to a specific file, you can use that URL to share the file with someone, and when they open it, the file will start downloading automatically (except for docs, HTML and images)... but hey, that's not the cool part! You can use these specific URLs to stream multimedia content for your webpages! You don't even need to worry about a player; you can use HTML5 video or audio tags and show your content straight away!
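For example, with a hypothetical video file inside the hosted folder above:

```html
<!-- Stream a Drive-hosted file with the HTML5 video tag
     (the file name is hypothetical) -->
<video width="640" controls>
  <source src="https://googledrive.com/host/0B_RClkFMLkcpdXlOZWdwM2JuWUk/volt-zelda.mp4"
          type="video/mp4">
</video>
```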

This works even for large videos. Here is a video of one of my cats (his name is Volt) playing Zelda:


Right on, huh? But no, you cannot have my cat... but you can host your pet's video and share it with us in the comments section at the bottom of this article!

To make things easier, I also added a copy of a very nice HTML5 video converter for Windows (there is a version for Mac OS users on the author's site); it does a great job getting your video ready for some HTML5 action.

Just remember, you are responsible for what you share on the internet, so be careful with how you use this tool, and never distribute proprietary material like music, movies or video clips.

Happy coding!