r***@apache.org
2018-06-18 15:37:41 UTC
Repository: mahout
Updated Branches:
refs/heads/master 7dff35bc3 -> edb29e5f2
NO-JIRA fix typos closes #356
commit 91bdf2c288a2547191cb2a19955ae2fb8bc0c582
Author: Alessandro Buggin <***@users.noreply.github.com>
Date: Sun May 20 23:36:13 2018 +0100
Fix typos
I was reading the index, found some typos and thought to just PR the fixes.
Cheers
Project: http://git-wip-us.apache.org/repos/asf/mahout/repo
Commit: http://git-wip-us.apache.org/repos/asf/mahout/commit/edb29e5f
Tree: http://git-wip-us.apache.org/repos/asf/mahout/tree/edb29e5f
Diff: http://git-wip-us.apache.org/repos/asf/mahout/diff/edb29e5f
Branch: refs/heads/master
Commit: edb29e5f2e7b4acf288cf106b7a58d0b3b80c965
Parents: 7dff35b
Author: Trevor a.k.a @rawkintrevo <***@gmail.com>
Authored: Mon Jun 18 10:34:03 2018 -0500
Committer: Trevor a.k.a @rawkintrevo <***@gmail.com>
Committed: Mon Jun 18 10:36:37 2018 -0500
----------------------------------------------------------------------
website/docs/latest/index.md | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/mahout/blob/edb29e5f/website/docs/latest/index.md
----------------------------------------------------------------------
diff --git a/website/docs/latest/index.md b/website/docs/latest/index.md
index 980aa01..6c4761e 100755
--- a/website/docs/latest/index.md
+++ b/website/docs/latest/index.md
@@ -32,7 +32,7 @@ application, and then invoke Mahout's mathematically expressive Scala DSL when y
## Samsara Scala-DSL (Syntactic Sugar)
-So when you get to a point in your code where you're ready to math it up (in this example Spark) you can elegently express
+So when you get to a point in your code where you're ready to math it up (in this example Spark) you can elegantly express
yourself mathematically.
implicit val sdc: org.apache.mahout.sparkbindings.SparkDistributedContext = sc2sdc(sc)
@@ -42,7 +42,7 @@ yourself mathematically.
val C = A.t %*% A + A %*% B.t
-We've defined a `MahoutDistributedContext` (which is a wrapper on the Spark Context), and two Disitributed Row Matrices (DRMs)
+We've defined a `MahoutDistributedContext` (which is a wrapper on the Spark Context), and two Distributed Row Matrices (DRMs)
which are wrappers around RDDs (in Spark).
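For context, here is a minimal, hypothetical sketch of how `A` and `B` could be defined (the matrices and partition count are made up for illustration; `drmParallelize` and the imports shown are the usual Samsara way to turn an in-core matrix into a DRM on the implicit context above):
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.scalabindings.RLikeOps._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.drm.RLikeDrmOps._
import org.apache.mahout.sparkbindings._
// Hypothetical in-core matrices, wrapped into DRMs on the implicit distributed context.
val inCoreA = dense((1.0, 2.0), (3.0, 4.0))
val inCoreB = dense((2.0, 1.0), (0.0, 3.0))
val A = drmParallelize(inCoreA, numPartitions = 2)
val B = drmParallelize(inCoreB, numPartitions = 2)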
## Logical / Physical DAG
@@ -54,7 +54,7 @@ At this point there is a bit of optimization that happens. For example, conside
Which is
<center>\(\mathbf{A^\intercal A}\)</center>
-Transposing a large matrix is a very expensive thing to do, and in this case we don't actually need to do it. There is a
+Transposing a large matrix is a very expensive thing to do, and in this case we don't actually need to do it: there is a
more efficient way to calculate \(\mathbf{A^\intercal A}\) that doesn't require a physical transpose.
(Image showing this)
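Concretely, the optimizer can exploit the fact that \(\mathbf{A^\intercal A}\) is just a sum of outer products of the rows of \(\mathbf{A}\), so a transposed copy of the matrix never needs to be materialized:
<center>\(\mathbf{A^\intercal A} = \sum_{i} \mathbf{a}_i \mathbf{a}_i^\intercal\)</center>
where \(\mathbf{a}_i\) is the \(i\)-th row of \(\mathbf{A}\), treated as a column vector.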
@@ -64,39 +64,38 @@ Mahout converts this code into something that looks like:
OpAtA(A) + OpABt(A, B) // illustrative pseudocode with real functions called
There's a little more magic that happens at this level, but the punchline is _Mahout translates the pretty Scala into
-a series of operators, which at the next level are turned implemented at the engine_.
+a series of operators, which are implemented at the engine level_.
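To make that concrete, here is a tiny self-contained toy (not Mahout's actual optimizer code; the names only mirror the pseudocode above): logical operators are plain data, and each engine pattern-matches them into its own physical implementations.
// Toy model only: logical operators are data describing the computation...
sealed trait LogicalOp
case class Leaf(name: String) extends LogicalOp                 // an already-materialized DRM
case class OpAtA(a: LogicalOp) extends LogicalOp                // A' A
case class OpABt(a: LogicalOp, b: LogicalOp) extends LogicalOp  // A B'
case class OpPlus(a: LogicalOp, b: LogicalOp) extends LogicalOp // elementwise +
// ...and an engine supplies the translation into its own physical ops.
def toPhysicalPlan(op: LogicalOp): String = op match {
  case Leaf(name)   => name
  case OpAtA(a)     => s"sparkAtA(${toPhysicalPlan(a)})"  // fused A'A, no explicit transpose
  case OpABt(a, b)  => s"sparkABt(${toPhysicalPlan(a)}, ${toPhysicalPlan(b)})"
  case OpPlus(a, b) => s"sparkEwPlus(${toPhysicalPlan(a)}, ${toPhysicalPlan(b)})"
}
toPhysicalPlan(OpPlus(OpAtA(Leaf("A")), OpABt(Leaf("A"), Leaf("B"))))
// => "sparkEwPlus(sparkAtA(A), sparkABt(A, B))"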
## Engine Bindings and Engine Level Ops
-When one creates new engine bindings, one is in essence defining
+When one creates new engine bindings, one is in essence defining:
1. What the engine-specific underlying structure for a DRM is (in Spark it's an RDD). The underlying structure also has
rows of `MahoutVector`s, so in Spark `RDD[(index, MahoutVector)]`. This will be important when we get to the native solvers.
1. Implementing a set of BLAS (basic linear algebra) functions for working on the underlying structure; in Spark this means
implementing things like `AtA` on an RDD, as sketched below. See [the sparkbindings on github](https://github.com/apache/mahout/tree/master/spark/src/main/scala/org/apache/mahout/sparkbindings)
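As a hypothetical sketch of what such an engine-level BLAS function could look like (this is not Mahout's actual implementation, which works block-wise; the sketch also assumes Mahout's vector/matrix types are serializable in your Spark setup and that the small `ncol x ncol` result fits in memory):
import org.apache.spark.rdd.RDD
import org.apache.mahout.math.{Matrix, Vector}
import org.apache.mahout.math.scalabindings.RLikeOps._
// Naive A' A over a DRM-like RDD[(Int, Vector)]: each row contributes the outer
// product of itself with itself, and the partial results are summed up.
def naiveAtA(drmRdd: RDD[(Int, Vector)]): Matrix =
  drmRdd
    .map { case (_, row) => row cross row } // outer product a_i a_i'
    .reduce(_ + _)                          // accumulate into one in-core matrix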
-Now your mathematically expresive Samsara Scala code has been translated into optimized engine specific functions.
+Now your mathematically expressive Samsara Scala code has been translated into optimized, engine-specific functions.
## Native Solvers
-Recall how I said the rows of the DRMs are `org.apache.mahout.math.Vector`. Here is where this becomes important. I'm going
+Recall how I said that the rows of the DRMs are `org.apache.mahout.math.Vector`. Here is where this becomes important. I'm going
to explain this in the context of Spark, but the principles apply to all distributed backends.
If you are familiar with how mapping and reducing work in Spark, then envision this RDD of `MahoutVector`s: each partition,
-and indexed collection of vectors is a _block_ of the distributed matrix, however this _block_ is totally incore, and therefor
-is treated like an in core matrix.
+an indexed collection of vectors, is a _block_ of the distributed matrix; however, this _block_ is totally in-core and is therefore treated like an in-core matrix.
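A rough sketch of that blockification idea (a hypothetical helper, not Mahout's actual blockification code; it assumes every row has the same length `ncol`):
import org.apache.spark.rdd.RDD
import org.apache.mahout.math.{DenseMatrix, Matrix, Vector}
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.scalabindings.RLikeOps._
// Illustrative only: gather each partition's rows into a single in-core matrix block.
def blockifyPartitions(drmRdd: RDD[(Int, Vector)], ncol: Int): RDD[(Array[Int], Matrix)] =
  drmRdd.mapPartitions { it =>
    val rows = it.toArray
    if (rows.isEmpty) Iterator.empty
    else {
      val keys  = rows.map { case (k, _) => k }
      val block = new DenseMatrix(rows.length, ncol)
      // copy each vector into the corresponding row of the in-core block
      rows.zipWithIndex.foreach { case ((_, v), i) => block(i, ::) := v }
      Iterator((keys, block))
    }
  }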
-Now Mahout defines its own incore BLAS packs and refers to them as _Native Solvers_. The default native solver is just plain
+Now Mahout defines its own in-core BLAS packs and refers to them as _Native Solvers_. The default native solver is just plain
old JVM, which is painfully slow, but works just about anywhere.
-When the data gets to the node and an operation on the matrix block is called. In the same way Mahout converts abstract
-operators on the DRM that are implemented on various distributed engines, it calls abstract operators on the incore matrix
+When the data gets to the node, an operation on the matrix block is called. In the same way Mahout converts abstract
+operators on the DRM that are implemented on various distributed engines, it calls abstract operators on the in-core matrix
and vectors, which are implemented on various native solvers.
-The default "native solver" is the JVM, which isn't native at all- and if no actual native solvers are present operations
+The default "native solver" is the JVM, which isn't native at all, and if no actual native solvers are present, operations
will fall back to this. However, IF a native solver is present (the jar was added to the notebook), then the magic will happen.
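For example (the coordinates below are illustrative only and depend on the Mahout release and Scala version you build against; check the published artifacts for your setup), pulling the OpenMP-backed ViennaCL solver into an sbt build might look like:
// build.sbt sketch -- artifact names/versions are assumptions, verify for your release
libraryDependencies ++= Seq(
  "org.apache.mahout" %% "mahout-math-scala"               % "0.13.0",
  "org.apache.mahout" %% "mahout-spark"                    % "0.13.0",
  "org.apache.mahout"  % "mahout-native-viennacl-omp_2.11" % "0.13.0"
)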
-Imagine still we have our Spark executor- it has this block of a matrix sitting in its core. Now let's suppose the `ViennaCl-OMP`
+Imagine still we have our Spark executor: it has this block of a matrix sitting in its core. Now let's suppose the `ViennaCl-OMP`
native solver is in use. When Spark calls an operation on this in-core matrix, the matrix is dumped out of the JVM and the
calculation is carried out on _all available CPUs_.
@@ -105,6 +104,6 @@ In a similar way, the `ViennaCL` native solver dumps the matrix out of the JVM a
Once the operations are complete, the result is loaded back up into the JVM by Spark (or whatever distributed engine) and
shipped back to the driver.
-The native solver operatoins are only defined on `org.apache.mahout.math.Vector` and `org.apache.mahout.math.Matrix`, which is
-why it is critical that the underlying structure composed row-wise of `Vector` or `Matrices`.
+The native solver operations are only defined on `org.apache.mahout.math.Vector` and `org.apache.mahout.math.Matrix`, which is
+why it is critical that the underlying structure is composed row-wise of `Vector`s or `Matrix` blocks.