Note: in the future I’d suggest making more atomic changes. As you delete thousands of files, I’m unable to understand a “diff” to master, so I’d look at separate commits for their meaning and review them one by one. The “remove external” would better be understood if “removing AprUtil lib from external”, “removing SWIG from external”, “Boost headers moved to /include/”, …

add all external header dependencies into /include/

we don’t have any of ours headers in include/, right? I’m just noting that having separate 3rd party code clearly from ours is useful.

replaced /src/nupic
replaced /src/test

replaced means moved, added something, …?

added /src/yaml-cpp

This is moved from external? Could it be /src/external/yaml-cpp then?

added /external/vs2017 which includes visual studio solution plus binaries. It does not work right now.

for the lib/, bin/ there…are those prebuilt binaries, or hosts build them themselves(then .gitignore)

how about keeping the external/ files as is (much less “changes” in PR)?

and putting the VS project in /IDE/vs2017 ?

removed /bindings/

rmoved and broke support for py-bindings (for now), or you are already able to replace that with pybind?

@breznak Thanks for looking into the changes. I agree it’s a huge change and I’m fairly certain it will be the only time like this. If you like we could do a more step by step approach to digest all the changes. I don’t mind redoing it. This way we could agree on the structure of changes.

For instance, I agree with you on having 3rd party stuff in a separate folder. The folder /include/ might not be the correct one. And it’s even incorrect for boost. We need some of the boost libraries like, system and filesystem and those sources are not in the repo right now.

Here is my suggested folder structure for external or should it be “3rd party”?

If you like we could do a more step by step approach to digest all the changes. I don’t mind redoing it. This way we could agree on the structure of changes.

If you don’t mind doing the rebasing for the 3rd time, I’d much appreciate doing a coordinated step-by-step merging. That way we’ll ensure tests pass, functionality stays and others can more easily discuss your changes on a “signel feature” PRs.

chhenning:

The folder /include/ might not be the correct one. And it’s even incorrect for boost. We need some of the boost libraries like, system and filesystem and those sources are not in the repo right now.

Good point. I don’t have a strong opinion on /include/ vs /external/xxx_lib , as both have some +/-. Or would it be possible to keep the current /external/{src,lib,include}/{boost, gtest,…} ?

chhenning:

/external/boost

If you are up for the 3rd round of changes, can we start with

the huge boost? Add the boost/windows stuff that you have. Optionally move it to the location (/include, /external/include/boost/, /external/boost) that you find fit.

add pybind, (and remove swig + …)

… I was guessing the changes you made in your fork.
The only requirement I’d like to achieve is to make PRs which always leave us in working state with tests and same level of features. Is that possible?

chhenning:

/exteranl/python36

I don’t see this in your code yet. Is it just amendments for py3, or do we really want to include and build from source all Py3, Py2.7?

If you don’t mind doing the rebasing for the 3rd time, I’d much appreciate doing a coordinated step-by-step merging. That way we’ll ensure tests pass, functionality stays and others can more easily discuss your changes on a “signel feature” PRs.

No, I don’t mind and it’s actually not much work. It’s important we get this right.

breznak:

Good point. I don’t have a strong opinion on /include/ vs /external/xxx_lib , as both have some +/-. Or would it be possible to keep the current /external/{src,lib,include}/{boost, gtest,…} ?

I would prefer all headers to be in include/ Including the external ones. Makes it easier to config builds, correct?

breznak:

the huge boost? Add the boost/windows stuff that you have. Optionally move it to the location (/include, /external/include/boost/, /external/boost) that you find fit.

What is boost/windows? You mean the prebuilt binaries?

Here is what I suggest in regards to boost:

add the complete boost source tree under external/boost

add the boost headers under include/boost

Same for all other dependencies.

The way I understand is that “HTM community” is self reliant. Meaning it keeps all dependencies in the repo. And in addition each dependency should be the complete source tree, including headers, source, tests, cmake scripts, etc.

breznak:

I was guessing the changes you made in your fork.

The only requirement I’d like to achieve is to make PRs which always leave us in working state with tests and same level of features. Is that possible?

Not sure this is possible for the big changes we doing now. How about we create a branch and do it there. Then when we are done we merge to master.

breznak:

I don’t see this in your code yet. Is it just amendments for py3, or do we really want to include and build from source all Py3, Py2.7?

I think we decided to stick to python 3. So no need to include python2.7 headers.

1/ rebase: perfect, thank you for your will to make this right, even with the obstacles!

2/ /include/ …yep, sounds good.

3/ boost …no, I noticed some new headers related to Windows platform (hope I was looking right).
Agree about the boost steps you’re suggesting, let’s start making PRs with the dependencies, and we can resolve the little details directly in the discussion above the codebase.

chhenning:

The way I understand is that “HTM community” is self reliant. Meaning it keeps all dependencies in the repo. And in addition each dependency should be the complete source tree, including headers, source, tests, cmake scripts, etc.

Hmm, I’m not sure about this one. We have been on/off in nupic with both camps - “we build all from source” vs. “try to use binary deps”. The All-Source makes easier to include the deps, but builds take longer (not much of a problem), but is more err prone (with regards to ‘something screwed up in my XYZ configuration’, mostly for non-dev users).
I think we included headers (no problems), and some libs that caused problems with official library (capnproto). Otherwise dependencies were mostly binary - via pip, numpy, python, swig, …

I would like to ask @rhyolight or some others who have been taking care of the dependencies, installs and users’ problems to weight in this question: “Use binary distribution of dependencies where possible, or rather build all ourselves from source”?

TL;DR: I’d shy away from including too much dependencies in our source build, rather make instructions to obtain them. Need advice from more involved ppl.

4/

Not sure this is possible for the big changes we doing now. How about we create a branch and do it there. Then when we are done we merge to master.

so at least as much as possible: we merge the deps, the restructuring you’ve made. Ideally we could also go on with the pybind replacement (?), then other changes can follow. This publish early, publish often approach would also make it simpler for people working on their forks/depending on your work to go head.

chhenning:

How about we create a branch and do it there. Then when we are done we merge to master.

Yes, that’s basically the situation now with your repo. Just make the changes in small enough batches to make it review-able.

chhenning:

I think we decided to stick to python 3. So no need to include python2.7 headers.

I would like to ask @rhyolight or some others who have been taking care of the dependencies, installs and users’ problems to weight in this question: “Use binary distribution of dependencies where possible, or rather build all ourselves from source”?

TL;DR: I’d shy away from including too much dependencies in our source build, rather make instructions to obtain them. Need advice from more involved ppl.

I agree with you. Bundling dependencies bloats the size of the distribution, and complicates installations where those binaries are pre-installed and customized.

Hmm, I’m not sure about this one. We have been on/off in nupic with both camps - “we build all from source” vs. “try to use binary deps”. The All-Source makes easier to include the deps, but builds take longer (not much of a problem), but is more err prone (with regards to ‘something screwed up in my XYZ configuration’, mostly for non-dev users).

I think we included headers (no problems), and some libs that caused problems with official library (capnproto). Otherwise dependencies were mostly binary - via pip, numpy, python, swig, …

I would like to ask @rhyolight or some others who have been taking care of the dependencies, installs and users’ problems to weight in this question: “Use binary distribution of dependencies where possible, or rather build all ourselves from source”?

TL;DR: I’d shy away from including too much dependencies in our source build, rather make instructions to obtain them. Need advice from more involved ppl.

I have a small sample project here that uses CMake and tool chains to create a project. Just proof of concept I could reach those dependencies without the include folder.

I added:

apr

apr-util

boost

yaml-cpp

zlib

CapnProto

You could take CapnProto out of the CMakeLists.txt, if you don’t want to build it. My hope was that you could use different tool chains to bring in those dependencies. Apt-Get for Unix systems? Not sure if this is the most robust solution. You are limitied - when using vcpkg - to the following ports: https://github.com/Microsoft/vcpkg/tree/master/ports.

Esp. since you provide the VCPKG (something like linux repositories, right?) for MS Win.

…reviewed the repo: I’d really like the reduction, for Unix systems with decent package managers it would be a no-brainer! For Win, I’m not sure how people would handle the VCPKG stuff ?? Or can we somehow automate that on Win?

Hi guys, if you guys don’t want to include the dependencies that’s fine with me. We could provide documentation on how to build the dependencies on the individual platforms. Let me know and I’ll make the adjustments.

Include all sources and headers
Include headers but no sources. cmake will have to be configured
Don’t provide headers nor sources. cmake will have to be configured

My preference is 1. but it has the already mentioned drawbacks of huge repo and slow builds.

I’d be deciding between 1, and 3, with preference for 3/no dependency sources. That is for those deps where there is a convenient method to obtain them. With biggest drawback being that “random” things mess up a users installs.

chhenning:

yaml-cpp (lib) I suggest we still include the source. Since nupic.core is using an outdated version which is hard to find.