fishcharlie

joined 2 years ago
[–] fishcharlie@eventfrontier.com 80 points 4 days ago (4 children)

This is not a “mistake”. This clearly proves they have Apple TV app integration implemented (just turned off). And someone accidentally turned it on.

But they have clearly put effort into adding this functionality.

New functionality doesn’t just happen by mistake.

[–] fishcharlie@eventfrontier.com 2 points 2 weeks ago (1 children)

Got it. Thanks so much for your help!! Still a lot to learn here.

Coming from a world of building software where things are very binary (it works or it doesn't), it's also really tough to judge what counts as "good enough". There is a point of diminishing returns, and I'm not sure when to say it's good enough versus continuing to learn and improve it.

Really appreciate your help here, though.

[–] fishcharlie@eventfrontier.com 1 points 2 weeks ago (3 children)

Someone else suggested reducing the learning rate. I tried that, and at least to me it looks a lot more stable between runs. All the code is my original code (none of the changes you suggested), except that I reduced the learning rate to 0.00001 instead of 0.0001.

Not quite sure what that means exactly, though, or whether more adjustments are needed.

As for the confusion matrix: I think the issue is the difference between smoothed values in TensorBoard and the actual values. I just ran it again with the previous values to verify, and it does look like it matches up if you look at the actual value instead of the smoothed one.
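
For reference, a minimal sketch of what that learning rate change looks like in Keras (the tiny model here is just a stand-in so compile() has something to act on, not the actual architecture from this thread):

import keras
from tensorflow.keras.optimizers import Adam

# Stand-in model; not the thread's actual architecture.
model = keras.models.Sequential([
    keras.layers.Input(shape=(64,)),
    keras.layers.Dense(1, activation='sigmoid'),
])

# Adam defaults to learning_rate=0.001. The comparison in this thread is
# 1e-4 (original) vs 1e-5 (more stable between runs, but each update moves
# the weights less, so training takes longer to converge).
model.compile(
    optimizer=Adam(learning_rate=1e-5),
    loss='binary_crossentropy',
    metrics=['accuracy'],
)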

[–] fishcharlie@eventfrontier.com 1 points 3 weeks ago* (last edited 3 weeks ago) (5 children)

Sorry for the delayed reply. I really appreciate your help so far.

Here is the raw link to the confusion matrix: https://eventfrontier.com/pictrs/image/1a2bc13e-378b-4920-b7f6-e5b337cd8c6f.webm

I changed it to keras.layers.Conv2D(16, 10, strides=(5, 5), activation='relu'). Dense units still at 64.

And in case the confusion matrix still doesn't work, here is a still image from the last run.

EDIT: The wrong image was uploaded originally.
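
For anyone following along, here is roughly what that layer does to the input dimensions (a sketch; the 256x256 RGB input shape is an assumption, not taken from this thread):

import numpy as np
import keras

# Conv2D(16, 10, strides=(5, 5)) means 16 filters, each a 10x10 kernel,
# sliding 5 pixels at a time. The large kernel and stride shrink the
# feature map quickly, which keeps the downstream Dense(64) layer small.
layer = keras.layers.Conv2D(16, 10, strides=(5, 5), activation='relu')

# With "valid" padding the output spatial size is floor((256 - 10) / 5) + 1 = 50.
x = np.zeros((1, 256, 256, 3), dtype=np.float32)
print(layer(x).shape)  # (1, 50, 50, 16)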

[–] fishcharlie@eventfrontier.com 1 points 4 weeks ago (7 children)

OK, I changed the Conv2D kernel to 10x10. I also changed the dense units to 64. Here is a single run of that with a confusion matrix.

I don't really see a bias towards non-blurred images.

[–] fishcharlie@eventfrontier.com 1 points 1 month ago (9 children)

So does the fact that they aren't converging near the same point indicate there is a problem with my architecture and model design?

[–] fishcharlie@eventfrontier.com 1 points 1 month ago (11 children)

Got it. I'll try with some more values and see what that leads to.

So does that mean my learning rate might be too high and it's overshooting the optimal solution sometimes based on those random weights?

I think what you’re referring to, iterating through algorithms and such, is called hyperparameter tuning. I think there is a tool called Keras Tuner you can use for this.

However, I’m incredibly skeptical that will work in this situation because of how variable the results are between runs. I run it with the same input, same code, everything, and get wildly different results. For hyperparameter tuning to be effective, I think the results need to be fairly consistent between runs.

I could be totally off base here, though. (I haven’t worked with this stuff a ton yet.)
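
In case it helps, a minimal Keras Tuner sketch looks something like this (the search space and stand-in model are illustrative assumptions, not this thread's architecture):

import keras
import keras_tuner as kt

def build_model(hp):
    # Search over dense layer width and learning rate.
    model = keras.models.Sequential([
        keras.layers.Input(shape=(64,)),
        keras.layers.Dense(hp.Int('units', 32, 128, step=32), activation='relu'),
        keras.layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(
            learning_rate=hp.Choice('learning_rate', [1e-3, 1e-4, 1e-5])),
        loss='binary_crossentropy',
        metrics=['accuracy'],
    )
    return model

# executions_per_trial averages the objective over several runs of the
# same configuration, which is meant to help exactly when single runs are noisy.
tuner = kt.RandomSearch(build_model, objective='val_accuracy',
                        max_trials=10, executions_per_trial=3)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val), epochs=5)

Notably, executions_per_trial exists for exactly the run-to-run variance concern above: the tuner scores each setting on the average of repeated runs rather than a single noisy one.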

[–] fishcharlie@eventfrontier.com 1 points 1 month ago (13 children)

Thanks so much for the reply!

“The convolution size seems a little small”

I changed this to 5 instead of 3, and it's hard to tell if that made much of an improvement. It's still pretty inconsistent between training runs.

“If it doesn’t I’d look into reducing the number of filters or the dense layer. Reducing the available space can force an overfitting network to figure out more general solutions”

I'll try reducing the dense layer from 128 to 64 next.

“Lastly, I bet someone else has either solved the same problem as an exercise or something similar and you could check out their network architecture to see if your solution is in the ballpark of something that works”

This is a great idea. I did a quick Google search and nothing stood out at first, but I'll dig deeper.


It's still super weird to me how variable it can be with zero changes. I don't change anything, and one run it improves consistently for a few epochs, while the next run it starts out a lot less accurate and declines after the first epoch.
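
One thing I might try to rule out the randomness is pinning the seeds; a minimal sketch, assuming a recent TF 2.x release:

import tensorflow as tf

# Seeds Python's random module, NumPy, and TensorFlow in one call, so
# weight initialization and data shuffling repeat across runs.
tf.keras.utils.set_random_seed(42)

# Optionally force deterministic GPU kernels as well; slower, but it
# removes most remaining run-to-run nondeterminism.
tf.config.experimental.enable_op_determinism()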

 

I'm trying to train a machine learning model to detect if an image is blurred or not.

I have 11,798 unblurred images, and I have a script to blur them and then use that to train my model.
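
The blurring step is essentially a Gaussian blur pass over each image; a simplified sketch of the idea (paths and radius here are placeholders, not my exact script):

from pathlib import Path
from PIL import Image, ImageFilter

# Every sharp image gets a Gaussian-blurred copy, producing paired
# blurred/unblurred training examples.
src = Path('images/sharp')
dst = Path('images/blurred')
dst.mkdir(parents=True, exist_ok=True)

for path in src.glob('*.jpg'):
    img = Image.open(path)
    img.filter(ImageFilter.GaussianBlur(radius=5)).save(dst / path.name)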

However, when I run the exact same training 5 times, the results are wildly inconsistent (as you can see below). It also only reaches 98.67% accuracy at best.

I'm pretty new to machine learning, so maybe I'm doing something really wrong. But coming from a software engineering background and just starting to learn machine learning, I have tons of questions. It's a struggle to know why it's so inconsistent between runs. It's a struggle to know how good is "good enough" (i.e., when I should deploy the model). It's a struggle to know how to continue to improve the accuracy and make the model better.

Any advice or insight would be greatly appreciated.

View all the code: https://gist.github.com/fishcharlie/68e808c45537d79b4f4d33c26e2391dd

[–] fishcharlie@eventfrontier.com 4 points 2 months ago

That’s attached to the instance? Do you have a screenshot maybe?

[–] fishcharlie@eventfrontier.com 1 points 2 months ago (3 children)

What is the error that you get?

[–] fishcharlie@eventfrontier.com 3 points 2 months ago (1 children)

Yes. It will just fill your feed with a bunch of things you might not care about. But admin vs. non-admin doesn’t matter in the context of what I said.

 

I just learned that the Eve Energy smart plugs transmit energy consumption information via Matter. I didn't think energy consumption was supported in Matter yet, but it is.

This makes them incredible to use with the Home Assistant Energy dashboard.

Even though I was hesitant for a while, I took the leap to the Matter beta Home Assistant integration, and there have been no issues so far.

 

It seems like running a pictrs server is optional when running Lemmy. I'm trying to figure out if a given instance supports pictrs.

I see in the documentation for pictrs that there is a GET /healthz endpoint. However, when I try to access https://lemmy.ml/pictrs/healthz, for example, it gives me a 404, even though I know lemmy.ml has a pictrs server.

What is the best way to determine if a Lemmy server has pictrs?
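
The best I've come up with so far is probing that endpoint directly and treating a 404 as inconclusive, since an instance's reverse proxy may simply not expose /healthz even when pict-rs is running. A rough sketch:

import requests

def has_pictrs(instance: str):
    """Probe https://{instance}/pictrs/healthz.

    Returns True/False on a clear answer, or None when the endpoint
    isn't exposed (like the 404 from lemmy.ml), meaning we can't tell.
    """
    try:
        resp = requests.get(f"https://{instance}/pictrs/healthz", timeout=10)
    except requests.RequestException:
        return None
    if resp.status_code == 200:
        return True
    if resp.status_code == 404:
        return None  # not proxied; pictrs may still exist behind the proxy
    return False

print(has_pictrs("lemmy.ml"))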

 

cross-posted from: https://eventfrontier.com/post/177049

I keep getting the error ValueError: perm should have the same length as rank(x): 3 != 2 when trying to convert my model using coremltools.

From my understanding, the most common cause for this is when the input shape you pass into coremltools doesn't match your model's input shape. However, as far as I can tell, mine does match. I also added an input layer, and that didn't help either.

I have put a lot of effort into reducing my code as much as possible while still giving a minimal, complete, verifiable example. However, I'm aware that the code is still a lot. My model is created and trained starting at line 60 of my code.

I'm running this on Ubuntu, with NVIDIA GPU support set up via Docker.

Any ideas what I'm doing wrong?


from typing import TypedDict, Optional, List
import tensorflow as tf
import json
from tensorflow.keras.optimizers import Adam
import numpy as np
from sklearn.utils import resample
import keras
import coremltools as ct

# Simple tokenizer function
word_index = {}
index = 1
def tokenize(text: str) -> list:
    global word_index
    global index
    words = text.lower().split()
    sequences = []
    for word in words:
        if word not in word_index:
            word_index[word] = index
            index += 1
        sequences.append(word_index[word])
    return sequences

def detokenize(sequence: list) -> str:
    global word_index
    # Filter sequence to remove all 0s
    sequence = [int(index) for index in sequence if index != 0.0]
    words = [word for word, index in word_index.items() if index in sequence]
    return ' '.join(words)

# Pad sequences to the same length
def pad_sequences(sequences: list, max_len: int) -> list:
    padded_sequences = []
    for seq in sequences:
        if len(seq) > max_len:
            padded_sequences.append(seq[:max_len])
        else:
            padded_sequences.append(seq + [0] * (max_len - len(seq)))
    return padded_sequences

class PreprocessDataResult(TypedDict):
    inputs: tf.Tensor
    labels: tf.Tensor
    max_len: int

def preprocess_data(texts: List[str], labels: List[int], max_len: Optional[int] = None) -> PreprocessDataResult:
    tokenized_texts = [tokenize(text) for text in texts]
    if max_len is None:
        max_len = max(len(seq) for seq in tokenized_texts)
    padded_texts = pad_sequences(tokenized_texts, max_len)

    return PreprocessDataResult({
        'inputs': tf.convert_to_tensor(np.array(padded_texts, dtype=np.float32)),
        'labels': tf.convert_to_tensor(np.array(labels, dtype=np.int32)),
        'max_len': max_len
    })

# Define your model architecture
def create_model(input_shape: int) -> keras.models.Sequential:
    model = keras.models.Sequential()

    model.add(keras.layers.Input(shape=(input_shape,), dtype='int32', name='embedding_input'))
    model.add(keras.layers.Embedding(input_dim=10000, output_dim=128)) # `input_dim` represents the size of the vocabulary (i.e. the number of unique words in the dataset).
    model.add(keras.layers.Bidirectional(keras.layers.LSTM(units=64, return_sequences=True)))
    model.add(keras.layers.Bidirectional(keras.layers.LSTM(units=32)))
    model.add(keras.layers.Dense(units=64, activation='relu'))
    model.add(keras.layers.Dropout(rate=0.5))
    model.add(keras.layers.Dense(units=1, activation='sigmoid')) # Output layer, binary classification (meaning it outputs a 0 or 1, false or true). The sigmoid function outputs a value between 0 and 1, which can be interpreted as a probability.

    model.compile(
        optimizer=Adam(),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )

    return model

# Train the model
def train_model(
    model: tf.keras.models.Sequential,
    train_data: tf.Tensor,
    train_labels: tf.Tensor,
    epochs: int,
    batch_size: int
) -> tf.keras.callbacks.History:
    return model.fit(
        train_data,
        train_labels,
        epochs=epochs,
        batch_size=batch_size,
        callbacks=[
            keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=5),
            keras.callbacks.TensorBoard(log_dir='./logs', histogram_freq=1),
            # When downgrading from TensorFlow 2.18.0 to 2.12.0 I had to change this from `./best_model.keras` to `./best_model.tf`
            keras.callbacks.ModelCheckpoint(filepath='./best_model.tf', monitor='val_accuracy', save_best_only=True)
        ]
    )

# Example usage
if __name__ == "__main__":
    # Check available devices
    print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

    with tf.device('/GPU:0'):
        print("Loading data...")
        data = (["I love this!", "I hate this!"], [0, 1])
        rawTexts = data[0]
        rawLabels = data[1]

        # Preprocess data
        processedData = preprocess_data(rawTexts, rawLabels)
        inputs = processedData['inputs']
        labels = processedData['labels']
        max_len = processedData['max_len']

        print("Data loaded. Max length: ", max_len)

        # Save word_index to a file
        with open('./word_index.json', 'w') as file:
            json.dump(word_index, file)

        model = create_model(max_len)

        print('Training model...')
        train_model(model, inputs, labels, epochs=1, batch_size=32)
        print('Model trained.')

        # When downgrading from TensorFlow 2.18.0 to 2.12.0 I had to change this from `./best_model.keras` to `./best_model.tf`
        model.load_weights('./best_model.tf')
        print('Best model weights loaded.')

        # Save model
        # I think that .h5 extension allows for converting to CoreML, whereas .keras file extension does not
        model.save('./toxic_comment_analysis_model.h5')
        print('Model saved.')

        my_saved_model = tf.keras.models.load_model('./toxic_comment_analysis_model.h5')
        print('Model loaded.')

        print("Making prediction...")
        test_string = "Thank you. I really appreciate it."
        tokenized_string = tokenize(test_string)
        padded_texts = pad_sequences([tokenized_string], max_len)
        tensor = tf.convert_to_tensor(np.array(padded_texts, dtype=np.float32))
        predictions = my_saved_model.predict(tensor)
        print(predictions)
        print("Prediction made.")


        # Convert the Keras model to Core ML
        coreml_model = ct.convert(
            my_saved_model,
            inputs=[ct.TensorType(shape=(max_len,), name="embedding_input", dtype=np.int32)],
            source="tensorflow"
        )

        # Save the Core ML model
        coreml_model.save('toxic_comment_analysis_model.mlmodel')
        print("Model successfully converted to Core ML format.")

Code including Dockerfile & start script as GitHub Gist: https://gist.github.com/fishcharlie/af74d767a3ba1ffbf18cbc6d6a131089
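
For what it's worth, rank-mismatch errors like this are often reported when the declared input shape is missing the batch dimension; a hedged sketch of that adjustment to the ct.convert call above (untested against this exact model):

import numpy as np
import coremltools as ct

# my_saved_model and max_len come from the script above. The only change
# is declaring the shape as (batch, sequence) instead of rank-1 (max_len,).
coreml_model = ct.convert(
    my_saved_model,
    inputs=[ct.TensorType(shape=(1, max_len), name="embedding_input", dtype=np.int32)],
    source="tensorflow",
)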

 

I created a Lemmy community specifically for TensorFlow! Check it out and subscribe if you're interested.

 

Is there a way to determine if a user sign-up application is pending solely based on the server instance URL and the username?

It looks like the profile page is created and live even while the application is pending, so that doesn't really help me determine it.

 

cross-posted from: https://eventfrontier.com/post/150886

I'm pleased to announce the release of Echo for Lemmy! Echo is a Lemmy client for iPhone that I've been working on for a while and I'm excited to finally share it with you all.

Echo for Lemmy is a fully native iOS application built with Apple's native SDKs. This means it feels right at home on your iPhone and is designed to be fast, efficient, and easy to use, with no overhead from web views or cross-platform frameworks.

Here are some of the features available in Echo for Lemmy:

  • Connect with communities based on your interests.
  • Sort your feed by most active, trending posts, new posts, and more.
  • Upvote and downvote posts & comments.
  • Powerful search experience to find the content you're looking for.
  • Create posts using share extension from any app on your device.
  • Bookmark posts to easily find later.
  • Fully native application with dark mode support & accessibility features.

Echo for Lemmy is available for free on the App Store, with subscription plans available for Echo+. You can download it here: Echo for Lemmy on the App Store.

You can also join the official Echo Lemmy community at [!echo@eventfrontier.com](/c/echo@eventfrontier.com).

I'm excited to hear feedback, bug reports, and feature suggestions. Feel free to comment here, or create a new post! You can also reach out via email at support@rrainn.com.

This is only the beginning. Much more to come!


Download Echo for Lemmy: https://echo.rrainn.com/download/iphone

Echo Lemmy Community: !echo@eventfrontier.com

Echo Mastodon Profile: @echo@mstdn-social.com


Screenshot of Echo for Lemmy on an iPhone showing a list of posts in your home feed.

 

Is there a way to get a list of users subscribed to a given Lemmy community? I'm trying to do some Lemmy-wide data analysis using that information.

Or alternatively, is there a way to get a list of communities a given user is subscribed to?

 

After upgrading Lemmy from 0.18.5 to 0.19.1, the lemmy_server process is taking up 200-350+% of my CPU. Although I haven't seen it max out my CPU yet, it's getting dangerously close at times.

Anything I can do to fix this?

I'd be fine with downgrading temporarily if this will require a Lemmy fix, but I'm not seeing any documentation on how to do that. I'm assuming DB migrations were run between those two versions (which might make that complicated).
