mahout git commit: [WEBSITE] Move BuildingMahout.md
r***@apache.org
2017-11-29 19:25:25 UTC
Repository: mahout
Updated Branches:
refs/heads/master e59101243 -> fe77fc19f


[WEBSITE] Move BuildingMahout.md


Project: http://git-wip-us.apache.org/repos/asf/mahout/repo
Commit: http://git-wip-us.apache.org/repos/asf/mahout/commit/fe77fc19
Tree: http://git-wip-us.apache.org/repos/asf/mahout/tree/fe77fc19
Diff: http://git-wip-us.apache.org/repos/asf/mahout/diff/fe77fc19

Branch: refs/heads/master
Commit: fe77fc19fc0c4d0c05c55a30a473acc71e30f1de
Parents: e591012
Author: Trevor a.k.a @rawkintrevo <***@gmail.com>
Authored: Wed Nov 29 13:25:14 2017 -0600
Committer: Trevor a.k.a @rawkintrevo <***@gmail.com>
Committed: Wed Nov 29 13:25:14 2017 -0600

----------------------------------------------------------------------
website/build_site.sh | 5 +
website/oldsite/developers/buildingmahout.md | 187 ++++++++++++++++++----
2 files changed, 164 insertions(+), 28 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mahout/blob/fe77fc19/website/build_site.sh
----------------------------------------------------------------------
diff --git a/website/build_site.sh b/website/build_site.sh
old mode 100755
new mode 100644
index 0a66962..d8502d8
--- a/website/build_site.sh
+++ b/website/build_site.sh
@@ -19,6 +19,7 @@ export PATH=${GEM_HOME}/bin:$PATH
(cd docs && bundle)
(cd docs && bundle exec jekyll build --destination $WORKDIR/docs/latest)

+
# Set env for docs
MAHOUT_VERSION=0.13.0
DISTFILE=apache-mahout-distribution-$MAHOUT_VERSION.tar.gz
@@ -37,4 +38,8 @@ rm -rf *
cp -a $WORKDIR/* .
git add .
git commit -m "Automatic Site Publish by Buildbot"
+<<<<<<< HEAD
+git push origin asf-site
+=======
git push origin asf-site
+>>>>>>> e591012439c04e98d669ef9732fde865a9ef76fa
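
Note on the hunk above: the added lines include Git merge-conflict markers (`<<<<<<< HEAD`, `=======`, `>>>>>>> e591012...`) inside `build_site.sh`. A minimal sketch of a check that could catch this before committing follows. The function names `has_conflict_markers` and `check_file` are hypothetical and not part of the Mahout repository.

```shell
#!/bin/sh
# Hypothetical helper (not from the Mahout repo): succeeds if FILE contains
# a line that looks like a Git merge-conflict marker.
has_conflict_markers() {
  grep -qE '^(<<<<<<<|>>>>>>>) |^=======$' "$1"
}

# Report on a single file; returns non-zero if markers are present.
check_file() {
  if has_conflict_markers "$1"; then
    echo "conflict markers found in $1" >&2
    return 1
  fi
  echo "ok: $1"
}
```

For example, `check_file website/build_site.sh` could be run from the repository root before `git commit`.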

http://git-wip-us.apache.org/repos/asf/mahout/blob/fe77fc19/website/oldsite/developers/buildingmahout.md
----------------------------------------------------------------------
diff --git a/website/oldsite/developers/buildingmahout.md b/website/oldsite/developers/buildingmahout.md
index 8e1e7f0..40b509b 100644
--- a/website/oldsite/developers/buildingmahout.md
+++ b/website/oldsite/developers/buildingmahout.md
@@ -1,16 +1,17 @@
---
layout: default
-title: BuildingMahout
-theme:
- name: retro-mahout
+title: Building Mahout
+theme:
+ name: mahout2
---

-# Building Mahout from source
+
+# Building Mahout from Source

## Prerequisites

* Java JDK 1.7
-* Apache Maven 3.3.3
+* Apache Maven 3.3.9


## Getting the source code
@@ -23,40 +24,170 @@ or

git clone https://github.com/apache/mahout.git

-##Hadoop version
-Mahout code depends on hadoop-client artifact, with the default version 2.4.1. To build Mahout against to a
-different hadoop version, hadoop.version property should be set accordingly and passed to the build command.
-Hadoop1 clients would additionally require hadoop1 profile to be activated.
+## Building From Source
+
+###### Prerequisites:
+
+Linux Environment (preferably Ubuntu 16.04.x) Note: Currently only the JVM-only build will work on a Mac.
+gcc > 4.x
+NVIDIA Card (installed with OpenCL drivers alongside usual GPU drivers)
+
+###### Downloads
+
+Install java 1.7+ in an easily accessible directory (for this example, ~/java/)
+http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
+
+Create a directory ~/apache/ .
+
+Download apache Maven 3.3.9 and un-tar/gunzip to ~/apache/apache-maven-3.3.9/ .
+https://maven.apache.org/download.cgi
+
+Download and un-tar/gunzip Hadoop 2.4.1 to ~/apache/hadoop-2.4.1/ .
+https://archive.apache.org/dist/hadoop/common/hadoop-2.4.1/
+
+Download and un-tar/gunzip spark-1.6.3-bin-hadoop2.4 to ~/apache/ .
+http://spark.apache.org/downloads.html
+Choose release: Spark-1.6.3 (Nov 07 2016)
+Choose package type: Pre-Built for Hadoop 2.4
+
+Install ViennaCL 1.7.0+
+If running Ubuntu 16.04+
+
+```
+sudo apt-get install libviennacl-dev
+```
+
+Otherwise, if your distribution’s package manager does not have a libviennacl-dev package at version 1.7.0 or later, clone it directly into the directory that will be included when Mahout is compiled:
+
+```
+mkdir ~/tmp
+cd ~/tmp && git clone https://github.com/viennacl/viennacl-dev.git && cd viennacl-dev
+sudo cp -r viennacl/ /usr/local/
+sudo cp -r CL/ /usr/local/
+```
+
+Ensure that the OpenCL 1.2+ drivers are installed (they are packaged with most consumer-grade NVIDIA drivers; this has not been verified for higher-end cards).
+
+Clone mahout repository into `~/apache`.
+
+```
+git clone https://github.com/apache/mahout.git
+```
+
+###### Configuration
+
+When building Mahout for a Spark backend, four environment variables must be set:
+```
+ export MAHOUT_HOME=/home/<user>/apache/mahout
+ export HADOOP_HOME=/home/<user>/apache/hadoop-2.4.1
+ export SPARK_HOME=/home/<user>/apache/spark-1.6.3-bin-hadoop2.4
+ export JAVA_HOME=/home/<user>/java/jdk-1.8.121
+```
+
+Mahout on Spark regularly uses one more environment variable, `MASTER`, which identifies the Spark master: either a local mode or the address of the cluster’s master node (usually the node one is logged into).
+
+To use 4 local cores (Spark master need not be running)
+```
+export MASTER=local[4]
+```
+To use all available local cores (again, Spark master need not be running)
+```
+export MASTER=local[*]
+```
+To point to a cluster with spark running:
+```
+export MASTER=spark://master.ip.address:7077
+```
+
+We then add these to the path:
+
+```
+ PATH=$PATH:$MAHOUT_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$JAVA_HOME/bin
+```
+
+These should be added to your ~/.bashrc file.
+
+
+###### Building Mahout with Apache Maven
+
+From the $MAHOUT_HOME directory, build each variant with the corresponding Maven profiles.
+
+JVM only:
+```
+mvn clean install -DskipTests
+```
+
+JVM with native OpenMP level 2 and level 3 matrix/vector Multiplication
+```
+mvn clean install -Pviennacl-omp -Phadoop2 -DskipTests
+```
+JVM with native OpenMP and OpenCL for level 2 and level 3 matrix/vector multiplication. (GPU errors fall back to OpenMP; currently only a single GPU per node is supported.)
+```
+mvn clean install -Pviennacl -Phadoop2 -DskipTests
+```
+
+### Changing Scala Version
+
+Profiles can be used to change the Scala version; however, the resulting artifacts seem to have trouble being resolved with SBT.
+
+```bash
+mvn clean install -Pscala-2.11
+```
+
+Maven is able to resolve the resulting artifacts effectively, and this will also work if the goal is simply to use the Mahout-Shell. However, if the goal is to build with SBT, the following tool should be used:
+
+```bash
+cd $MAHOUT_HOME/buildtools
+./change-scala-version.sh 2.11
+```
+
+Now go back to `$MAHOUT_HOME` and execute
+
+```bash
+mvn clean install -Pscala-2.11
+```
+
+**NOTE:** You still need to pass the `-Pscala-2.11` profile, as this determines and propagates the minor Scala version (e.g. 2.11.8).
+
+
+### The Distribution Profile

-The build lifecycle is illustrated below.
+The distribution profile, among other things, will produce the same artifact for multiple Scala and Spark versions.

-## Compiling
+Specifically, in addition to creating all of the

-Compile Mahout using standard maven commands
+Default Targets:
+- Spark 1.6 Bindings, Scala-2.10
+- Mahout-Math Scala-2.10
+- ViennaCL Scala-2.10*
+- ViennaCL-OMP Scala-2.10*
+- H2O Scala-2.10

- # With hadoop-2.4.1 dependency
- mvn clean compile
+It will also create:
+- Spark 2.0 Bindings, Scala-2.11
+- Spark 2.1 Bindings, Scala-2.11
+- Mahout-Math Scala-2.11
+- ViennaCL Scala-2.11*
+- ViennaCL-OMP Scala-2.11*
+- H2O Scala-2.11

- # With hadoop-1.2.1 dependency
- mvn -Phadoop1 -Dhadoop.version=1.2.1 clean compile
+Note: * ViennaCLs are only created if the `viennacl` or `viennacl-omp` profiles are activated.

-##Packaging
+By default, this phase will execute the `package` lifecycle goal on all built "extra" variants.

-Mahout has an extensive test suite which takes some time to run. If you just want to build Mahout, skip the tests like this
+E.g. if you were to run

- # With hadoop-2.4.1 dependency
- mvn -DskipTests=true clean package
+```bash
+mvn clean install -Pdistribution
+```

- # With hadoop-1.2.1 dependency
- mvn -Phadoop1 -Dhadoop.version=1.2.1 -DskipTests=true clean package
+You will `install` all of the "Default Targets" but only `package` the "Also created".

+If you wish to `install` all of the above, you can set the `lifecycle.target` switch as follows:

-In order to add mahout artifact to your local repository, run
+```bash
+mvn clean install -Pdistribution -Dlifecycle.target=install
+```

- # With hadoop-2.4.1 dependency
- mvn clean install

- # With hadoop-1.2.1 dependency
- mvn -Phadoop1 -Dhadoop.version=1.2.1 clean install

-
\ No newline at end of file
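
The Configuration section added in this patch can be condensed into one shell fragment. The following is a sketch only: it assumes the example directory layout from the patch (with `$HOME` standing in for `/home/<user>`), picks `local[*]` as the `MASTER` value, and uses a hypothetical helper named `check_mahout_env` that is not part of Mahout. Each entry on the `PATH` line is written as `$NAME/bin`.

```shell
#!/bin/sh
# Sketch of a ~/.bashrc fragment. Paths mirror the examples in the patch;
# adjust them to wherever each tool was actually unpacked.
export MAHOUT_HOME="$HOME/apache/mahout"
export HADOOP_HOME="$HOME/apache/hadoop-2.4.1"
export SPARK_HOME="$HOME/apache/spark-1.6.3-bin-hadoop2.4"
export JAVA_HOME="$HOME/java/jdk-1.8.121"   # example path taken from the patch
export MASTER="local[*]"                    # all local cores; quoted to avoid globbing

# Append each tool's bin directory to the search path.
export PATH="$PATH:$MAHOUT_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$JAVA_HOME/bin"

# Hypothetical helper: print every variable whose directory does not exist
# and return non-zero if any is missing.
check_mahout_env() {
  missing=0
  for name in MAHOUT_HOME HADOOP_HOME SPARK_HOME JAVA_HOME; do
    eval "dir=\${$name}"
    if [ ! -d "$dir" ]; then
      echo "missing: $name=$dir"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_mahout_env` in a new shell before `mvn clean install` gives a quick indication of whether any of the four directories is mistyped.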
