Sunday, 21 August 2016

TL;DR The Google Summer of Code period has ended, and I am glad that I was able to meet all the goals and develop something productive for the Drupal community. In this blog post, I will share the details of the project, the functionality of the module, and its current status.

I am glad that I was one of the lucky students selected to be a part of the Google Summer of Code 2016 program for the project “Integrate Google Cloud Vision API to Drupal 8”. The project was under the mentorship of Naveen Valecha, Christian López Espínola and Eugene Ilyin. Under their mentoring and guidance, I was able to meet all the goals and develop something productive for the Drupal community.

Google Cloud Vision API brings automated content analysis to images. The API can not only detect objects ranging from animals to famous monuments, but also detect faces and their emotions. In addition, the API can help censor images, extract text from images, detect logos and landmarks, and even determine attributes of the image itself, for instance the dominant color in the image. Thus, it can serve as a powerful content analysis tool for images.

Now let us see how we can put the module to use, i.e., what its use cases are. To start with, the Google Vision API module allows taxonomy tagging of image files using Label Detection.

Label Detection classifies an image into a number of general-purpose categories, for example, classifying a war scene under labels such as war, troop, soldiers and transport, based on the surroundings in the image. This feature is especially useful for filtering images by tags.
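
Under the hood, such a call boils down to one request against the Vision REST v1 `images:annotate` endpoint. The following is a minimal Python sketch of that request body; the module itself does this from PHP, and the helper name here is purely illustrative:

```python
import base64
import json

def build_label_request(image_bytes, max_results=5):
    """Build the JSON body for a Vision API label-detection call.

    Mirrors the REST v1 images:annotate request shape: a base64-encoded
    image plus a list of requested features.
    """
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
        }]
    }

# Fake bytes standing in for a real PNG upload.
body = build_label_request(b"\x89PNG fake image bytes")
print(json.dumps(body["requests"][0]["features"]))
```

The response's `labelAnnotations` entries (each with a `description` and a `score`) are what the module turns into taxonomy terms.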

The second use case on our list is Safe Search Detection. It quickly identifies and detects the presence of any explicit or violent content in an image which is not fit for display.

When this feature is enabled in the module, the Safe Search technique validates every image for explicit/violent content. If such content is found, the image is flagged for moderation and is not allowed to be uploaded to the site, thus keeping the site clean.
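
The moderation decision rests on the `safeSearchAnnotation` part of the API response, which reports likelihood values per category. A sketch of such a check follows; the exact threshold used by the module is an assumption here:

```python
# Likelihood values returned in a Vision API safeSearchAnnotation,
# ordered from least to most likely.
LIKELIHOODS = ["UNKNOWN", "VERY_UNLIKELY", "UNLIKELY",
               "POSSIBLE", "LIKELY", "VERY_LIKELY"]

def needs_moderation(annotation, threshold="POSSIBLE"):
    """Flag an image if its adult or violence likelihood meets the threshold.

    `annotation` mirrors the API's safeSearchAnnotation dict; the
    threshold choice is illustrative, not the module's exact rule.
    """
    limit = LIKELIHOODS.index(threshold)
    return any(LIKELIHOODS.index(annotation.get(k, "UNKNOWN")) >= limit
               for k in ("adult", "violence"))

print(needs_moderation({"adult": "VERY_UNLIKELY", "violence": "LIKELY"}))  # True
```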

Please click here for video demonstration of the two above-mentioned use cases.

Continuing with the other use cases, the third one is filling the Alternate Text field of an image file.

The Label, Logo, Landmark and Optical Character Detection features of the Google Cloud Vision API have been used to implement this use case. Based on the option chosen by the end user, the Alternate Text for any image can be auto-filled by one of the four above-mentioned detections. The choice “Label Detection” fills the field with the first value returned in the API response. “Logo Detection” identifies the logos of famous brands, and can be used to fill the field accordingly. Likewise, “Landmark Detection” identifies monuments and structures, ranging from natural to man-made, and “Optical Character Detection” detects and identifies the text within an image, filling the Alternate Text field accordingly.
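
Picking the value out of the API response can be sketched as below. The annotation field names follow the Vision v1 response format; the helper itself and the mapping of user choices to fields are illustrative, the module implements this in PHP:

```python
def alt_text_from_response(response, feature):
    """Pick an Alternate Text value from a Vision API v1 response.

    Each detection writes to its own annotation list; the first entry's
    description is taken, mirroring the first-value behaviour described
    for Label Detection.
    """
    key = {
        "label": "labelAnnotations",
        "logo": "logoAnnotations",
        "landmark": "landmarkAnnotations",
        "text": "textAnnotations",
    }[feature]
    annotations = response.get("responses", [{}])[0].get(key, [])
    return annotations[0]["description"] if annotations else ""

sample = {"responses": [{"labelAnnotations": [
    {"description": "mountain", "score": 0.95}]}]}
print(alt_text_from_response(sample, "label"))  # mountain
```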

Next comes the User Emotion Detection feature.

This feature is especially useful during new account creation. When enabled, it detects the emotion of the user in the profile picture and notifies the new user if they seem to be unhappy in the image, prompting them to upload a happier one.
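
The Face Detection response carries per-face emotion likelihoods (`joyLikelihood`, `sorrowLikelihood`, `angerLikelihood`, `surpriseLikelihood`). A sketch of an "unhappy" check on one `faceAnnotation`, with the notification rule itself being an assumption:

```python
def seems_unhappy(face_annotation):
    """Heuristic check on a single faceAnnotation from the Vision API.

    Uses the sorrow/anger likelihood fields of the v1 response; the
    exact rule the module applies may differ.
    """
    unhappy = {"LIKELY", "VERY_LIKELY"}
    return (face_annotation.get("sorrowLikelihood") in unhappy
            or face_annotation.get("angerLikelihood") in unhappy)

print(seems_unhappy({"joyLikelihood": "VERY_UNLIKELY",
                     "sorrowLikelihood": "LIKELY"}))  # True
```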

Lastly, the module also allows displaying similar image files.

Based on the dominant color component (red, green or blue), the module quickly groups all the images which share the same dominant component, and displays them as a list under the “Similar Content” tab. Each item links to the image file and is named after the filename saved by the user.

Users should note that by “similar contents” we do not mean that the images will always resemble each other; rather, it means they share the same dominant color component.
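
The dominant component comes from the `imagePropertiesAnnotation` in the API response, which lists scored dominant colors. A sketch of reducing that to a single red/green/blue bucket; the response shape follows the v1 format, while the reduction itself is an illustrative reading of the module's behaviour:

```python
def dominant_component(image_properties):
    """Return 'red', 'green' or 'blue' for an imagePropertiesAnnotation.

    Takes the highest-scored entry of dominantColors.colors and compares
    its RGB channels; images are then grouped by this component.
    """
    colors = image_properties["dominantColors"]["colors"]
    top = max(colors, key=lambda c: c.get("score", 0))["color"]
    channels = {"red": top.get("red", 0),
                "green": top.get("green", 0),
                "blue": top.get("blue", 0)}
    return max(channels, key=channels.get)

sample = {"dominantColors": {"colors": [
    {"color": {"red": 20, "green": 180, "blue": 60}, "score": 0.7},
    {"color": {"red": 200, "green": 10, "blue": 10}, "score": 0.2},
]}}
print(dominant_component(sample))  # green
```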

All the details of my work, the interesting facts and features have been shared on the Drupal Planet.

Please watch this video to learn how to use the above-mentioned features properly.

This is the complete picture of the Google Vision API module developed during the Google Summer of Code phase (May 23, 2016 – August 23, 2016).

Let me now share what my project is based on, and what were the tasks I had proposed to complete.

All the features which I had proposed to implement are listed below:

Integrate the Label Detection feature with the image field.

Integrate the Landmark Detection with the image field.

Integrate the Logo Detection with the image field.

Integrate the Explicit Content Detection with the image field.

Integrate the Optical Character Recognition with the image field.

Integrate the Face Detection with the image field.

Integrate the Image Attributes with the image field.

After discussion with my mentors, we decided on the following use cases to implement the above proposed features:

Use the Label, Landmark, Logo and Optical Character Detection to fill the Alternate Text field of the image files uploaded by the user.

Use the Explicit Content Detection feature to identify and detect any explicit or violent content present in the images, and prevent the uploading of such content.

Use the Face Detection feature to detect the emotions of the users in their profile pictures, and notify the users if they seem to be unhappy.

Use the Image Attributes feature to detect the dominant color in the image, and group the image files on that basis.

In addition to these implementations, I worked on developing tests for the functionality of the module, applying important Drupal 8 concepts such as services and containers, and using abstract parent classes for the tests.

I made my contribution to Drupal in the form of a module named Google Vision API.

All the contributions were made under the guidance and supervision of my mentors, Naveen Valecha, Christian López Espínola and Eugene Ilyin, and were committed to the module only with their approval.

In order to share the weekly progress of the project with the Drupal community, I maintained blog posts on Drupal Planet, where I shared my work experience, the tasks I accomplished, and the issues I faced along with their solutions. Please click here to read all the blog posts.

This is the complete picture of my code and contributions during the Google Summer of Code period (May 23, 2016 – August 23, 2016).

Tuesday, 16 August 2016

TL;DR Last week I worked on moving the helper functions for filling the Alt Text of an image file to a new service, and on moving the reused/supporting functions of the tests to an abstract parent class, GoogleVisionTestBase. This week I have worked on improving the documentation of the module and making the number of Label Detection results configurable.

With all major issues and features committed to the module, this week I worked on a few minor issues, including documentation and cleanup in the project.

It is an immense pleasure for me that I am receiving feedback from the community on the Google Vision API module. An issue, Improve documentation for helper functions, was created to develop the documentation further and provide the finer details of the code. I have worked on it and added more documentation to the helper functions so that they can be understood better.

In addition, a need was felt to make the number of results obtained from the Vision API configurable for each feature, giving the end user control over it. The corresponding issue is Make max results for Label Detection configurable. In my humble opinion, most of the feature implementations and requests to the Google Cloud Vision API have no need for a user-configurable number of results. For instance, the Safe Search Detection feature detects explicit content and prevents it from being uploaded, and does not need a configurable number of results. However, taxonomy tagging using Label Detection should be user dependent, and hence I worked on the issue to make the value configurable for Label Detection only. This value can be configured from the Google Vision settings page, where we set the API key. I have also developed simple web tests to verify that the value is configurable. Presently, the issue is under review.

I have also worked on coding-standards fixes and PAReview fixes, and assisted my mentor, Naveen Valecha, in developing interfaces for the services. I assisted him on the access rights of the functions, and in fixing the documentation issues which clashed with the present ones.

Lastly, I worked on improving the README and the module page to include all the new information and instructions implemented during the Google Summer of Code phase.

With all this work done, and all the minor issues resolved, I believe that the module is ready for use, with all the features and end user cases implemented.

Next week, I’ll work on creating a video demonstration on how to use Google Vision API to fill the Alt Text attribute of an image file, detect the emotion in user profile pictures, and group similar images which share the same dominant color.

Tuesday, 9 August 2016

TL;DR Last week I worked on modifying the tests for the “Fill Alt Text”, “Emotion Detection” and “Image Properties” features of the Google Vision API module. The only tasks left were moving the supporting functions to a separate service, and creating an abstract parent class for the tests and moving the shared functions there.

There are a few supporting functions, namely google_vision_set_alt_text() and google_vision_edit_alt_text(), which fill the Alt Text in accordance with the feature requested from the Vision API, and also manipulate the value if needed. I moved these functions to a separate service, namely FillAltText, and altered the code to use the functions from there instead of accessing them directly.

In addition, there were a number of supporting functions used in the simple web tests of the module, to create users, contents and fields, which were placed in each test file itself, a form of redundancy. Hence, I moved all these supporting functions to an abstract parent class named GoogleVisionTestBase, and altered the test classes to extend this parent class instead of WebTestBase. This removed the redundant code, as well as gave a proper structure and orientation to the web tests.

These minor changes would be committed to the module directly, once the major issues are reviewed by my mentors and committed to the module.

Wednesday, 3 August 2016

TL;DR Last week, I had worked on and developed tests to ensure that the Alt Text field of an image file gets filled in accordance to the various detection features of the Vision API, namely Label Detection, Landmark Detection, Logo Detection and Optical Character Detection. This week I have worked to modify and add tests to various features of the Google Vision module, namely filling of Alt Text field, emotion detection of user pictures and grouping the image files on the basis of their dominant color component.

My mentors reviewed the code and the tests which I had put up for review to get them committed to the Google Vision API module. However, the code needed some amendments pointed out by my mentors, which were to be corrected before commit. Hence, I spent this week working on those issues and resolving the flaws, rather than starting on a new feature.

Let me start discussing my work in detail.

I had submitted the code and the tests which ensure that the Alt Text field gets properly filled using the various detection features, according to the end user’s choice. However, as my mentor pointed out, it had one drawback: the user would not be able to manipulate or change the value of the field if they wished to. There was also a small bug amidst the different options available for filling the Alt Text field: once an option was selected, it was possible to switch between the options, but disabling them altogether did not work. After these were pointed out, I worked on modifying the feature, introducing the end user’s ability to manipulate the value of the field as and when required, and resolving the issue of disabling the feature.

Regarding the Emotion Detection (Face Detection) feature of the Vision API, I was guided to use dependency injection instead of calling static methods directly, for example, using get(‘entity_type.manager’) over the static call \Drupal::entityTypeManager(). Apart from these minor changes, a major issue was that the feature was triggered whenever an image file was involved at all. However, I needed it to act only when the user uploads an image, and not on its removal (as both actions involve an image file, hence the bug).

In the issue Implementation of Image Properties feature in the Vision API, I had queried the database multiple times in a loop to fetch results and build the routed page using controllers. However, my mentor pointed out that this is a poor way of implementing database queries. Hence, I modified the code to fetch the results with a single query and use them to build the page. In addition, I was asked to build the list using ‘item_list’ instead of the conventional ‘#prefix’ and ‘#suffix’. Another important change in my approach was dropping db_query(), whose use is deprecated; I switched to addExpression() instead.

Presently, the code is under review by the mentors. I will work further on them, once they get reviewed and I get further instructions on it.