
GirlScript Foundation, a non-profit registered with the Government of India, has come up with India's first women-oriented technical festival. GirlScript Summer of Code (GSSoC) is a 3-month-long open source program under GirlScript India. Thanks, GSSoC, for selecting my project Openchat. As I am a Google Summer of Code student, I will be working there full time and it is my first priority; as the admin of the project, however, I can help the assigned mentors with any project-related knowledge and with permissions to the repository.

About the Project- Openchat

Openchat, as its name suggests, is a chatting/messaging platform where people can send messages to anyone registered on the platform. Openchat is my summer project, which I developed in my first year while learning PHP. I tried my best to follow code quality standards and to document the project so that it is readable and understandable to new folks, and I hope the same from you guys.

Contributing to project

- I think the first step in contributing to any project is to set it up on your local system, so I advise all interested folks to do the same. I have added instructions in the README to set up the project locally.

- Once the project is set up and working well on your system, try to get familiar with the code by running the project and finding small issues and bugs. This way, you will better understand the flow of the project and come up with great ideas and suggestions.

- I have opened a lot of issues in the Issues section of the project and marked them with difficulty levels, which can help you pick up issues. Anyone who is interested in working on an issue should comment on the issue itself, so that other people know someone is already working on it. I would really appreciate it if you ask questions related to any issue in that issue's comment section; this will help other developers who are facing similar problems.

- Fork the project and create a new branch before starting work on any issue, then make a pull request. I will review and merge the PR as soon as possible; if I take more time reviewing the PR, you can ping me on Twitter @ankitjain28may or IRC @ankitjain28 (most active on Twitter and IRC), or on the Slack channel. You can also mail me at ankitjain28may77@gmail.com (28 May is my DOB, but I am not that old; Google wanted to add 77 to make it unique).

- All the very best to all the participants. I hope to collaborate with some great developers and to help new folks start their beautiful journey with open source.

Guide for beginners

GSSoC is doing great work by encouraging people, so I want you to get familiar with Git and GitHub before starting work on the project. You should have a good understanding of version control, and we mentors are here to guide and help you throughout the GSSoC period, and even after it if anyone wants.

It would be best to make your system more developer-friendly by installing Sublime Text or some IDE and other tools. Here I am sharing some resources that can help you.

About Me

I am a final-year (not yet, waiting for my 3rd-year results :p) Computer Science student from Noida. I was selected for Google Summer of Code this year and am working with Drupal. I am an open source contributor and web developer. You can learn more about me in the "About Me" section at the top of the page and through my LinkedIn profile. You can also contribute to any of my open source projects. Link to GitHub profile - ankitjain28may

The news that Microsoft has acquired GitHub has shaken the developer community. No one expected this, and my perspective was the same; I really wanted GitHub to remain an independent entity that we developers could trust. But the scenario has completely changed: Microsoft has acquired GitHub, home to millions of developers, for $7.5 billion.

Developer's Reaction

Due to Microsoft's bad history, developers find it difficult to accept this change, and they think Microsoft will ruin GitHub too, as it did with other products like Nokia. Many developers are moving to GitLab, a rival service (#movetogitlab). Many developers are going with the flow, and some are still deciding what to do. I still think that Microsoft's acquisition of GitHub can empower developers. I like GitHub's user interface, so I would like to stay with GitHub.

Reason for the acquisition

Microsoft has acquired Skype, Minecraft, and LinkedIn, and none of these is a failure; they have been developed further. Microsoft has changed quite a lot and is now the world's largest open source contributor (according to GitHub). GitHub is home to 24 million users collaborating on code across over 80 million repositories. Microsoft doesn't want to acquire their code; it wants to provide developers with more facilities. Microsoft's own VS Code is loved by millions of developers, is entirely open source, and is built using GitHub's Electron platform.

Microsoft is trying to get the attention of developers and plans to empower GitHub with Microsoft Azure, a cloud computing service that is a good alternative to Amazon Web Services. They have mentioned bringing Microsoft IT technologies and services to GitHub. Microsoft mentioned in the blog post:

GitHub will remain an open platform, which any developer can plug into and extend. Developers will continue to be able to use the programming languages, tools and operating systems of their choice for their projects – and will still be able to deploy their code on any cloud and any device.

Microsoft appeals to developers that it is contributing to open source, that it is the world's largest open source contributor, and that some of its popular frameworks and projects are also open source: "So when it comes to our commitment to open source, judge us by the actions we have taken in the recent past, our actions today, and in the future."

Nat Friedman, GitHub's new CEO, has put it beautifully in his blog: "I'm not asking for your trust, but I'm committed to earning it."

I don't know Microsoft's master plan behind the acquisition of GitHub, but I really hope that it will all be for empowering developers. Thanks for reading the article. Let me know your views in the comment section.

Week 3 of the GSoC coding period has been completed successfully. GSoC (Google Summer of Code) is a global program focused on bringing more student developers into open source software development. Students work with an open source organization on a 3-month programming project during their break from school.

Project Abstract

I am working on "Developing a “ Product Advertising API ” module for Drupal 8" - #7. The "Product Advertising API" module, which has been renamed "Affiliates Connect", provides an interface to easily integrate with affiliate APIs or product advertising APIs provided by e-commerce platforms like Flipkart, Amazon, and eBay, to fetch data from their databases and import it into Drupal so you can monetize your website by advertising their products. If an e-commerce platform doesn't provide an affiliate API, we will scrape the data from it.

Progress

Some of the tasks accomplished this week are -

- The configuration form for saving the affiliates_connect settings is completed. Link to the issue - #2976037

- The custom Affiliates Product entity for storing product data from various vendors is almost completed and has been reviewed by borisson_, dbjpanda, and other mentors. Link to the issue - #2975642

- As every vendor has a different configuration, the configuration form for the plugins is still under development and discussion. Link to the issue - #2977044

- Functional tests for verifying the routes defined in the project, as suggested by borisson_, are also completed and under review. This also includes functional tests checking whether product data is submitted correctly by the affiliates_product add form. Tests for deleting and editing products are also completed under this issue. Link to the issue - #2977377

Tests can increase velocity without too much extra work.

Week 4 - Goals

The basic module development is completed, so I will start working on the Scraper API. I have already done some work this week by implementing some basic scraping using various libraries/modules in Node. Link to the repo - scraping-using-node. I have also done some study of the Flipkart Affiliate APIs in this repo, so I am working on the Flipkart plugin as well. I will update my further work in the repo.

The flow for scraping any e-commerce website's content is -

1. The first task is to find the sitemap URL, which contains the category URLs. Sitemap URLs for Flipkart and Amazon - Flipkart, Amazon.in

2. Once we have the product categories, we can scrape products category-wise and paginate through each category up to its last page to scrape every product, so we need some way to paginate a whole category.

3. We need detailed product data, so we have to visit each product link and scrape its content.

4. Save all product links to a file for further scraping and for updating existing products.
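The pagination step above can be sketched as a small helper; the `?page=N` query parameter is an assumption for illustration, since every site exposes pagination differently:

```javascript
// Hypothetical sketch of step 2: given a category URL and the number of
// pages, build the full list of page URLs to crawl.
function buildPageUrls(categoryUrl, lastPage) {
  const urls = [];
  for (let page = 1; page <= lastPage; page++) {
    urls.push(`${categoryUrl}?page=${page}`);
  }
  return urls;
}

console.log(buildPageUrls('https://example.com/mobiles', 3));
// → [ 'https://example.com/mobiles?page=1',
//     'https://example.com/mobiles?page=2',
//     'https://example.com/mobiles?page=3' ]
```

In practice the last page number itself has to be discovered from the category page, which is part of what makes the pagination problem tricky.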

I decided to scrape these sites using x-ray in Node; I found x-ray well suited to this project because of features like pagination, crawler support, and pluggable drivers (e.g. PhantomJS), which are required to scrape websites that use React/Angular to load their content. While working with x-ray, however, I found that the pagination feature was not working as expected and caused lots of issues. Together with shibasisp and other mentors, I went through many libraries, like Casper, Osmosis, Webdriver.io, and Nightmare. After trying many of them, Nightmare turned out to be the best fit for this project. Nightmare uses Electron under the hood, which is similar to PhantomJS but roughly twice as fast and more modern. Nightmare handles/manipulates the DOM through its evaluate function, which is complex to work with directly. So I fetch the innerHTML through the evaluate function and pass the content to Cheerio, which is an easy, fast, and flexible way to handle/query the DOM content.
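The evaluate-then-parse pattern can be sketched as follows. In the real module the HTML would come from `nightmare.goto(url).evaluate(() => document.body.innerHTML)` and the querying would be done with `cheerio.load(html)`; here a tiny regex stand-in keeps the sketch dependency-free, and the `product` class and field names are made-up examples, not the project's actual selectors:

```javascript
// Stand-in for the Cheerio step: pull product links and titles out of the
// innerHTML string that Nightmare's evaluate() would return.
function extractProducts(html) {
  const products = [];
  const re = /<a class="product" href="([^"]+)">([^<]+)<\/a>/g;
  let match;
  while ((match = re.exec(html)) !== null) {
    products.push({ url: match[1], title: match[2] });
  }
  return products;
}

// In the real flow this string would be fetched by Nightmare from the page.
const sampleHtml =
  '<div><a class="product" href="/p/1">Phone</a>' +
  '<a class="product" href="/p/2">Laptop</a></div>';

console.log(extractProducts(sampleHtml));
// → [ { url: '/p/1', title: 'Phone' }, { url: '/p/2', title: 'Laptop' } ]
```

Splitting fetching (Nightmare) from parsing (Cheerio) also makes the parsing half easy to unit-test, since it operates on plain strings.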

Difficulties

- We need to scrape data for thousands of products, which can take a lot of time depending on the implementation, so I need to devise an algorithm that takes minimal time to scrape the whole lot of data.

Week 2 of the GSoC coding period has been completed successfully. GSoC (Google Summer of Code) is a global program focused on bringing more student developers into open source software development. Students work with an open source organization on a 3-month programming project during their break from school.

Project Abstract

I am working on "Developing a “ Product Advertising API ” module for Drupal 8" - #7. The "Product Advertising API" module, which has been renamed "Affiliates Connect", provides an interface to easily integrate with affiliate APIs or product advertising APIs provided by e-commerce platforms like Flipkart, Amazon, and eBay, to fetch data from their databases and import it into Drupal so you can monetize your website by advertising their products. If an e-commerce platform doesn't provide an affiliate API, we will scrape the data from it.

- Along with the skeleton issue, the overview page, which shows the different plugins enabled by the user, is also completed.

- It will also show the fetcher status, as shown in the attached image above.

- The configuration form for saving the affiliates_connect settings is almost completed and is under review. Link to the issue - #2976037

- The custom product entity for storing product data from various vendors still needs some work and will be completed within this week. Link to the issue - #2975642

Week 3 - Goals

This week, the basic module for developing/integrating the sub-modules will be completed, and I will start working on the Scraper API, which will be developed using Node.js and npm packages.

- As every vendor has a different configuration, a configuration form will be added for the sub-modules.

- The common configuration will be inherited from the parent module, which was completed in this issue - #2976037

- Start working on Scraper API for sub-modules.

Difficulties

- Most websites use the latest front-end frameworks like Angular and React, so to fetch dynamic content from such websites we need a headless browser, for which I am using x-ray in Node.

Week 1 of the GSoC coding period has been completed successfully. GSoC (Google Summer of Code) is a global program focused on bringing more student developers into open source software development. Students work with an open source organization on a 3-month programming project during their break from school.

Project Abstract

I am working on "Developing a “ Product Advertising API ” module for Drupal 8" - #7. The "Product Advertising API" module, which has been renamed "Affiliates Connect", provides an interface to easily integrate with affiliate APIs or product advertising APIs provided by e-commerce platforms like Flipkart, Amazon, and eBay, to fetch data from their databases and import it into Drupal so you can monetize your website by advertising their products. If an e-commerce platform doesn't provide an affiliate API, we will scrape the data from it.

Progress

I am enjoying working on this project and learning new things. The module development progress and code can be checked at this link - Affiliates-Connect. Some of the tasks accomplished in week 1 are -

- The basic skeleton of the module is ready.

- As product data is a common entity across all the vendors, the custom content entity needed to save product data is completed.

- The configuration form for saving the affiliates_connect settings is completed.

- A basic plugin manager has been added to handle the plugins that will be provided as sub-modules of this module.

As this module is similar to Social Auth, I took the help of gvso to understand the working of Social Auth and its various implementers, like social_auth_facebook and social_auth_google. With his help, the project is crystal clear to me, and the mentors are there to guide us and review our code at each step.

Week 2 - Goals

This week, the basic module for developing/integrating the sub-modules will be completed, and I will start working on the Scraper API, which will be developed using Node.js and npm packages.

- As every vendor has a different configuration, a configuration form will be added for the sub-modules.

- The common configuration will be inherited from the parent module.

- Plugin Manager with all the required functionalities will be completed.

- Start working on Scraper API for sub-modules.

Difficulties

- The plugin manager's functionality needs to be defined accurately and precisely so that we can better manage the plugins and inherit common functionality from the parent module.

- Most websites use the latest front-end frameworks like Angular and React, so to fetch dynamic content from such websites we need a headless browser, for which I am using x-ray in Node.

The coding period has started, and I am very excited to begin working on my project - Affiliates Connect. This module provides an interface to easily integrate with various affiliate APIs or product advertising APIs provided by e-commerce platforms like Flipkart, Amazon, and eBay, to fetch data from their stores and display it to monetize your website by advertising their products.

Project Roadmap

This project is divided into three phases -

1st phase deliverables -

Fully functional affiliates_connect module with all required tests and API documentation for developers.

2nd phase deliverables -

3rd phase deliverables -

1. Cloaking and hit analysis features to let the site builder check their marketing/advertising reports.
2. A form widget browser to let the site builder create advertising content.
3. A product search page for side-by-side comparison of products from different vendors.
4. A Drupal distribution of a product-compare engine.
5. Documentation and a screencast explaining how to use this module.

Project management

Project management is done through the project issue queue. I will create an issue for each task and attach patches to it, so that mentors can easily review my work and follow my progress.

Progress

This module has two main functionalities -

1. It lets developers leverage its functionality by providing their own plugins, e.g. an affiliate_connect_example module to fetch data from an example website.

2. It provides a scraper as a fallback fetcher.

This week, I am working on creating database schemas for storing product metadata and user configuration for the various modules/plugins. The schema is designed using a custom entity.

After finishing the database schema, I will start working on the Scraper API, which will scrape data from various e-commerce sites. For scraping, I am using Node.js and providing a REST API for each of the scraping functions. An HTTP request will be sent from Drupal, through cURL or the PHP Guzzle package, to the Node API, which will call the respective scraping function and send the data back to Drupal as a JSON response.

Now the question arises: why Node.js? Why not some other language?

This is because I find the x-ray module very appropriate for this project, as it has an inbuilt pagination feature and a plugin system, and it supports PhantomJS for dynamic scraping, delays, throttling, timeouts, and many more features. Another reason for using Node.js is that Drupal is adopting some frameworks powered by Node.js, like Nightwatch for functional browser testing. Here is the link to the post on Drupal Planet regarding this - Nightwatch in Drupal Core

REST is one of the most popular ways of making web services work. By providing a REST API, Drupal can interact with the scripts by sending HTTP requests and exchanging data in JSON format. REST is now a Drupal standard.

Difficulties

- Most websites use the latest front-end frameworks like Angular and React, so to fetch dynamic content from such websites we need a headless browser, for which I am using x-ray in Node.

- My end-semester exams are going on and will end on the 24th of May, so I won't be able to work to my full potential. I will try to work as much as I can during this period and will make up for it after my exams.

In my previous blog post, Google Summer of Code 2018 - Blog 1 - preparation drupal, I shared how I got started with open source and how I prepared for GSoC'18 over the last month. Finally, the wait is over and the results are out: I have been selected for Google Summer of Code 2018 under the Drupal organization. It is one of the happiest moments of my life so far. My attempts and efforts since the first year have finally resulted in success. I thank everyone who helped me on my way to achieving this. A big thanks to my mentors and friends who motivated and helped me. Result Page

Project Abstract

I am working on "Developing a “ Product Advertising API ” module for Drupal 8" - #7. The "Product Advertising API" module, which has been renamed "Affiliates Connect", provides an interface to easily integrate with affiliate APIs or product advertising APIs provided by e-commerce platforms like Flipkart, Amazon, and eBay, to fetch data from their databases and import it into Drupal so you can monetize your website by advertising their products. If an e-commerce platform doesn't provide an affiliate API, we will scrape the data from it.

Community Bonding Phase

Before coding starts, around 15-20 days are given to students to get familiar with the mentors and people in the organization. It helps us in discussions and builds good connections. I created a room on IRC and added all the mentors there to discuss things related to the project. Dibyajyoti Panda, one of my mentors, assigned me a task before the coding phase: to create a development environment that helps developers as well as mentors review our code easily. I found this idea great and started working on the task.

This project helps in managing a Drupal workflow using Ansible, Docker, Git, and Composer. You can manage multiple sites, such as production, development, and staging servers, through this project; the whole environment is set up, dependencies are installed, and configuration is synced in a single click.

I have worked with Git and Composer before, but Ansible and Docker were like alien technologies to me. I love to explore new technologies, so I brewed my coffee, selected my favorite playlist, and started learning Ansible and Docker. I find both of them so cool that I now wonder why I was not aware of them before. I am literally in love with Docker and want to explore and learn it further.

As of now, most of the work on this project is completed. Dbjpanda has been reviewing my code continuously on GitHub and suggesting improvements, which I am working on. I had a Hangouts session with him to show my progress. Continuous integration and deployment are equally important for any good project. I am using Travis CI, where a user can check whether the tests pass or whether a push has broken the code. After a successful build, Travis deploys the code to the deployment server automatically using Ansible.

Connecting People

During my preparation for GSoC'18, I was continuously contributing to the Drupal issue queues, writing patches and reviewing patches. Each credit for resolving issues inspired me to work harder. I connected with Joris Vercammen while writing patches, and on request he reviewed my proposal; his feedback helped me improve it.

I got in touch with other students who were selected for GSoC this year. I talked to Chiranjeeb Mahanta and helped him install Drupal with Docker using the project I am working on.

While working on the project, I got in touch with Paritoshik Paul on LinkedIn through Dbjpanda. We had a small conversation about this project; since Paritoshik has created a similar module in D7, i.e. Flipkart API, he is also joining us as a mentor. I also found that Getulio Valentin Sánchez wants to mentor this project, so I am trying to get in touch with him. I have sent him a mail through the Drupal contact form and am waiting for his reply. I will work hard and learn new things while working with such experienced folks.

One of the struggles developers face when moving to Drupal 8 is the lack of best practices for deploying Drupal sites. The challenges in deployment: dependency management, Drupal contrib modules/themes, configuration management, and of course the code base. Drupal 7 had no such problems. Ahhh, Drupal 8 comes with lots of stuff to manage. One of the biggest changes in Drupal 8 is the adoption of Composer. Good things come at a price.

We will use one codebase for one Drupal site and use git for version control and deployment.

Composer

Composer is a dependency manager for PHP (like npm for Node or pip for Python). Drupal core uses Composer to manage core dependencies like Symfony components and Guzzle. Composer allows us to systematically manage a list of dependencies and their transitive dependencies, which it installs via a manifest file called composer.json.

This composer.json file lists the dependencies that the project requires, which are installed by running -

composer install

for the first time. It locates, downloads, validates, and loads the packages. It also ensures that exactly the right version of each package is used, and maintains a lock file called composer.lock.

Note: Always commit your composer.lock file, because it records the exact versions of the dependencies installed for the project.

If you want to update any specific package, it's a good practice to run this command -

composer update package/package-name

You should never run a bare composer update, because Composer will then try to update every single dependency, which can cause problems for your site.

It will automatically install a Drupal site with all the dependencies and also install Drupal console and Drush locally.

Composer is one of the fastest ways to install dependencies, as it caches them and loads data from the cache the next time.

Directory Structure:

It differs from the standard Drupal directory structure. The web directory corresponds to the public directory and contains the Drupal files. All third-party dependencies are kept outside the web folder.

You can install any Drupal modules, themes, and profiles through Composer; they will be downloaded into the contrib folder inside modules, themes, and profiles respectively. This way, the composer.lock file keeps a record of all the Drupal contrib modules along with the third-party dependencies.
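To make this concrete, here is a minimal composer.json fragment in the spirit of the drupal-composer template; the package names and version constraints are illustrative, not prescriptive:

```json
{
    "name": "my/drupal-site",
    "require": {
        "drupal/core": "^8.5",
        "vlucas/phpdotenv": "^2.4"
    },
    "extra": {
        "installer-paths": {
            "web/core": ["type:drupal-core"],
            "web/modules/contrib/{$name}": ["type:drupal-module"],
            "web/themes/contrib/{$name}": ["type:drupal-theme"]
        }
    }
}
```

The installer-paths section is what routes contrib modules and themes into the contrib folders under web/ described above.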

To download any module or theme using Composer -

composer require drupal/mediumish_blog

# For installing theme, we will use drupal console

drupal theme:install mediumish_blog

Gitignore:

As all the dependencies and Drupal contrib modules and themes are managed by Composer, we do not push this content to Git.

Configuration Management

Deployment and configuration management are common actions in a project's life cycle. We have installed various modules and configured our local site, but our production site has no such configuration.

In Drupal 7, we had the Features module, which was used to sync configuration. Drupal 8, however, has an inbuilt solution for managing configuration, which allows you to export the complete website configuration and store it in YAML files. The exported files can be imported into another website to reproduce the same result. Drupal's configuration system helps solve the config synchronization problem in two ways: a unified way to store configuration, and a process to import/export changes between instances of the same site.

How to synchronize config files

Open /web/sites/default/settings.php and set $config_directories['sync']

$config_directories['sync'] = '../config/sync';

It's good practice to store config files outside of the web directory to avoid making them accessible from the Internet.

Now use Drupal Console to export the configuration -

drupal config:export

# import on prod server

drupal config:import

Note: Both the production and local Drupal sites must have the same UUID. More info

Generally, we face problems when updating Drupal core; Composer has a simple way to manage this too -

composer update drupal/core --with-dependencies

It will update the Drupal core and all its associated dependencies.

Managing Environment Configuration

The feature I was most looking for as a developer in Drupal is managing configuration per environment. This can be done using the vlucas/phpdotenv package, which also comes with the Drupal Composer template.

Anything that is likely to change between deployment environments, such as database credentials or credentials for third-party services, should be extracted from the code into environment variables. Basically, a .env file is an easy way to load the custom configuration variables your application needs without having to modify any other files.

Rename .env.example to .env and add all the credentials as key-value pairs in the .env file.

The load.environment.php file in the project root loads this .env file and makes the variables available to you.

So, in this way, if you set APP_ENV='local', Twig debugging will be enabled, and on production you can disable it by setting APP_ENV='prod'. You can also define different configuration for each environment.
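For illustration, a .env file might look like the following; the key names besides APP_ENV are placeholders I made up, and real credentials should of course never be committed:

```ini
# Environment-specific settings, loaded by load.environment.php
APP_ENV=local
DB_NAME=drupal
DB_USER=drupal
DB_PASSWORD=secret
DB_HOST=127.0.0.1
```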

Conclusion

Drupal 8 offers a built-in solution for exporting and importing site configuration, which is way better than what you can do in D7. Dependencies and contrib modules/themes are managed by Composer itself. It's not yet perfect and there is no single standard approach, but the workflow described above is a simple and efficient solution. You can define your own workflow based on your needs.

My journey into open source and its community started with my college life. I was introduced to open source through the Nibble Computer Society, the official Computer Science society of our college. Later, I joined the society as a web developer. In this blog, I will share my experience of how I started with open source and how I prepared for GSoC'18.

My passion for programming and development helped me explore the open source community. I took some online courses, which helped me get started with Git and GitHub. Once I was familiar with version control, I pushed all my code to GitHub, which helped me in development as well as in collaboration with other developers.

Google Summer of Code

Google Summer of Code is a global program focused on bringing more student developers into open source software development.
Students work with an open source organization on a 3 month programming project during their break from school.

Deshraj Yadav, one of the members of our society and a GSoC'15 student, introduced us to Google Summer of Code and told us about its perks. I started preparing for GSoC'16, but due to a lack of experience and skills, I was not selected.

Drupal

When Google announced the organizations selected for GSoC'18, I searched for two organizations: phpMyAdmin and Drupal. I had been contributing to phpMyAdmin for some time.

While reading Drupal's project ideas, I found project 7 interesting because it involves web scraping, and I am quite familiar with web scraping. The project idea is to create a module that lets bloggers and advertisers easily fetch product data through affiliate APIs provided by different e-commerce platforms, or through scraping, and monetize their websites by advertising those products.

After selecting the project, I contacted the mentors assigned to it. I got a response from Dibyajyoti Panda, one of my mentors, within an hour, which got me excited about the project. I did these things —

I joined the Drupal-Google IRC channel in order to connect with the Drupal community and other students.

I completed the Drupal Ladder, which helped me get started with Drupal.

My mentors guided and helped me a lot in learning Drupal, because it's really important to get familiar with Drupal before writing a proposal for a project idea. I created a Trello board and added my mentors to it so they could follow my progress and give feedback/reviews on the tasks assigned to me. I contributed to Drupal core by writing some patches and earned 19+ credits in one month, before the proposal submission.

The Proposal Deadline

I started writing my proposal after completing the tasks assigned by my mentors. A proposal describes what you hope to accomplish, why those objectives are important, and how you intend to achieve them. It also covers the student's past experience and work. It is important to write a good proposal that lets readers understand the idea and the implementation plan, so I added some flowcharts/workflow diagrams and mockups. I completed my proposal 2 days before the deadline (March 27, 2018, 21:30 IST) and asked for my mentors' advice and feedback. I submitted my final proposal a few hours before the deadline.

Application Review Period

During the review period, I revised PHP OOP concepts, studied Symfony components and Drupal APIs, and researched how scraping can be optimized to handle thousands of products' data, so that if I were selected for the project, I could devote more time to coding and development. This was my experience of the proposal submission period.