18 votes

What programming/technical projects have you been working on?

This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?

18 comments

  1. [8]
    Deimos
    (edited )

    I want to talk about something I ended up doing in the Tildes code yesterday, because—at least, in my opinion—I think it's fascinating. It's some combination of amazing and terrifying, and kind of broke my mental model of the relationship between the application and the database. Bear with me for a bit, I'm going to continue being vague and explain how I got there first:

    On Tildes, when something has a unique ID, it's usually displayed in an alphanumeric format, which makes it shorter and look better in URLs. For example, this topic's ID is "ioq", which you can see in the URL (.../~comp/ioq/...) as well as in the short link in the sidebar: https://tild.es/ioq. The ID isn't actually stored that way though; it's stored as a normal number and just converted to that format for display purposes. So when I need to take something I'm looking at on the site and look it up in the database, I always need to convert the alphanumeric ID back. The way I usually do this is to open a Python session and call the same function the site code uses for that conversion:

    >>> from tildes.lib.id import id36_to_id
    >>> id36_to_id("ioq")
    24218
    

    ("id36" is short for "base-36 ID": each character can be one of 36 symbols, the 10 digits plus the 26 letters)
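
    For anyone curious what such a conversion involves, here is a minimal sketch. It is a hypothetical re-implementation for illustration, not the actual tildes.lib.id code, which may validate input or handle errors differently; the ALPHABET constant and the id_to_id36 helper are assumptions.

```python
import string

# Hypothetical re-implementation for illustration only; the real
# tildes.lib.id module may differ (e.g. in validation or error handling).
ALPHABET = string.digits + string.ascii_lowercase  # "0"-"9" then "a"-"z": 36 symbols


def id36_to_id(id36: str) -> int:
    """Convert a base-36 string such as "ioq" to its integer value."""
    return int(id36, 36)  # int() accepts any base from 2 to 36


def id_to_id36(id_: int) -> str:
    """Convert a positive integer back to its base-36 string."""
    if id_ < 1:
        raise ValueError("ID must be a positive integer")
    digits = []
    while id_ > 0:
        # Peel off the least significant base-36 digit each iteration.
        id_, remainder = divmod(id_, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


print(id36_to_id("ioq"))  # 24218, matching the session above
print(id_to_id36(24218))  # ioq
```

    Worked by hand: i = 18, o = 24, q = 26, so 18 * 36^2 + 24 * 36 + 26 = 23328 + 864 + 26 = 24218.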

    After doing that, I know the integer ID is 24218, and I can go to the database and do my select/update/whatever with WHERE topic_id = 24218. This is obviously pretty trivial overall, but it's a bit annoying to need to do whenever this situation comes up, so I wanted a way to just be able to do it in the database, like WHERE topic_id = id36_to_id('ioq').

    You can add custom functions like that, and I've already written some to implement trigger behavior. The "native" languages for PostgreSQL are fairly clunky though, and I can hardly write anything in them without constantly looking things up in the documentation. So even though this is a simple function, I didn't really want to rewrite it into one of those languages and need to remember to update it if I ever touch the Python function.

    I remembered that PostgreSQL supports writing functions in some flavor of Python, but I had never really looked into it before. I decided to try playing around with that a little and had the thought of, "I've already written this function in Python, can I just... import it, instead of copy-pasting it into a PostgreSQL function?" And it turns out that you can! I just had to set the PYTHONPATH environment variable appropriately in PostgreSQL to point to the application's virtualenv, and now the id36_to_id PostgreSQL function is this:

    CREATE OR REPLACE FUNCTION id36_to_id(id36 TEXT) RETURNS INTEGER AS $$
        from tildes.lib.id import id36_to_id
        return id36_to_id(id36)
    $$ IMMUTABLE LANGUAGE plpython3u;
    

    It just calls the application's function. This is obviously a really simplistic example of what you can do with it, but the upshot is that... all of the application's Python code (and all of the libraries it uses) can be available for use in the database itself. This is something that I never realized was even possible, and it's super interesting to me. Like I said, it's kind of ruined my mental concept of how an application needs to do certain things with its database.
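
    The same general idea can be tried without a PostgreSQL server. The sketch below swaps in SQLite through Python's standard sqlite3 module, which lets a connection register a Python function as an SQL function. This is an analogy only, not PL/Python; the table layout and the function body are assumptions made up for the example.

```python
import sqlite3


def id36_to_id(id36: str) -> int:
    # Stand-in for the application's own function (hypothetical body).
    return int(id36, 36)


conn = sqlite3.connect(":memory:")

# Expose the Python function to SQL under the same name, taking 1 argument.
conn.create_function("id36_to_id", 1, id36_to_id)

# Hypothetical table, only here so the query has something to match.
conn.execute("CREATE TABLE topics (topic_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO topics VALUES (24218, 'example topic')")

# The application function is now callable directly inside the query.
row = conn.execute(
    "SELECT topic_id, title FROM topics WHERE topic_id = id36_to_id('ioq')"
).fetchone()
print(row)  # (24218, 'example topic')
```

    One difference worth keeping in mind: with sqlite3 the function lives in the application process and is only registered on that one connection, whereas a PL/Python function is stored in the database and runs inside the server.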

    There's definitely danger to it too, and I have no idea how much I'll end up actually using it, but it's making me think about some new possibilities.

    22 votes
    1. [4]
      unknown user

      There's definitely danger to it too

      oh no don't you tease us like that

      I like how obvious the imported function code is. I have some programming experience, and none in SQL, and I can still read perfectly what it's supposed to do. JS doesn't have that unless you go out of your way.

      4 votes
      1. [3]
        Deimos
        (edited )

        I'm not sure if there's any extremely specific danger, I just think adding complex code onto database operations is scary in general. You don't want something like a trigger function that executes whenever a comment is created to be slow or have a chance of crashing and causing data to not save correctly. It also makes it harder for people to know it's happening, since it turns into something that "magically" happens in the database, instead of being able to see it in the main application code.

        3 votes
        1. [2]
          skybrian

          Yes, and I also wonder about version skew and deployment. When a Python file on disk changes, when exactly does it start getting used in the database? If you replicate the database, how do you make sure both databases have the same code, and what happens when they're temporarily out of sync during an upgrade?

          2 votes
          1. Deimos
            (edited )

            Yeah, there's a lot of weird, scary potential and questions I don't know the answers to.

            I get the impression that the functions get "reinterpreted" for each connection. If I open a psql session and use my id36_to_id function, the first time is slow, but any following calls are fast. If I quit and open a new session, the first call is slow again.
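
            One plausible explanation (an assumption, not something I've confirmed) is that each session starts its own Python interpreter, so the first call pays for startup plus the import, and later calls hit Python's normal module cache. The sketch below shows only that caching behavior in plain Python, outside the database; demo_lib is a made-up stand-in for application code.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep stale .pyc files out of the demo

# Write a tiny throwaway module (hypothetical stand-in for application code).
module_dir = pathlib.Path(tempfile.mkdtemp())
module_file = module_dir / "demo_lib.py"
module_file.write_text("def version():\n    return 'old'\n")
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()


def call_version() -> str:
    # The import statement runs on every call, but Python only loads the
    # file the first time; afterwards it reuses the entry in sys.modules.
    from demo_lib import version
    return version()


before = call_version()

# Change the file on disk, as a deployment would.
module_file.write_text("def version():\n    return 'new code'\n")
after_edit = call_version()  # still served from the cached module

import demo_lib

importlib.reload(demo_lib)  # explicit reload; a fresh process also works
after_reload = call_version()

print(before, "|", after_edit, "|", after_reload)  # old | old | new code
```

            If the database behaves the same way, a changed Python file would only be picked up by new connections, which fits with the first call in each new session being the slow one.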

            For the site, I'm using PgBouncer to prevent connections from being recreated all the time like that, so that "refresh" shouldn't be happening very often. But then, if I want to refresh it, I'll need to remember to get PgBouncer to recreate its connections. And how does it behave if I use imported Python functions inside triggers? Are those connection-specific? I have no idea (but I could figure out a lot of it with testing, if I wanted to).

            I don't think replication would be affected (depending on the type of replication) because I think it just replicates the end state of the data, so triggers and such shouldn't re-run, but I'm not absolutely sure about that either.

            3 votes
    2. [2]
      Comment deleted by author
      1. Deimos

        I don't know much about other databases' support for writing custom functions. I think they generally support it in some way, but I don't know if there's as much flexibility about language. From a quick look, it looks like in MySQL, they generally have to be written in C or C++. I used to occasionally write some in Oracle at one of my old jobs, but they would always be written in its native "PL/SQL" language (which is similar to PostgreSQL's native "PL/pgSQL" one). Their docs sound like it should be possible to write them in Java and some other languages too.

        PostgreSQL has 4 included "procedural languages": PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. It's also possible to add more than that, and I know I've heard of people using Lua and some other languages for their functions.

        2 votes
    3. BuckeyeSundae

      I, uh, program a lot in both python and postgres. Never really thought much about how the two could interact. This is really cool.

      1 vote
  2. [5]
    dblohm7

    I recently switched teams at work, and I now work on GeckoView, which is Mozilla's next generation embedding solution for building Gecko-based browsers on Android. It is the embedding technology underlying the new Firefox Preview (codenamed "Fenix").

    12 votes
    1. [2]
      skybrian

      Just curious, do you get to use Rust much?

      1 vote
      1. dblohm7
        (edited )

        Not personally at this time. Our usage of Rust is highly focused on specific areas (contrary to the "Mozilla is rewriting everything in Rust" rumours that are constantly circulating). I have yet to be involved much with any Rust components, though I expect that to change.

        EDIT: To elaborate on this a bit, the mantra within Gecko is, "Ignorance of Rust should not be the deciding factor as to whether or not you use it." Having said that, rewriting code for its own sake in a different language is not a good idea. Our recommendations are centered around focusing on areas where there are clear safety and/or concurrency wins. New, from-scratch components are likely to be written in Rust as well.

        3 votes
    2. [2]
      LewsTherinTelescope

      Using Fenix right now, I appreciate all the work you guys have been putting into this stuff! Not sure if it's actually the case or just feels like it, but it feels like Fenix and GeckoView are faster than Fennec was, so if they are, then thanks for that!

      1. dblohm7

        Yes, it's definitely much faster!

        2 votes
  3. [3]
    tlalexander

    I’m learning machine learning and using that to make my off-road robot autonomous. The robot is called Rover and is all 3D printed from my own design. CC0 open source of course. I’ve got an NVIDIA Jetson Xavier and my goal is to build an open source camera-based autonomy system for it. I’ve got a video showing it off below, which includes some shots of my first foray into training machine learning models for this robot. I took Google’s pixel-wise segmentation algorithm DeepLab and trained it to recognize “trail” and “not trail” on some images of a forest trail I took. Since this video I’ve ordered a four-camera array of 13 megapixel cameras with hardware-synchronized shutters, made to go directly into the Xavier’s camera inputs. I’ve also made a pipeline so I can go from photographs of the world to a 3D model of the world to simulation in that model for reinforcement learning in a photorealistic environment. That uses COLMAP and Habitat-Sim.

    Here’s the video:
    https://youtu.be/BSRa2zZ6CtQ

    Bonus photo album of the robot and the 3D printed dual stage planetary gearboxes. So far those same gearboxes have lasted almost a year with no signs of wear!
    https://imgur.com/gallery/GqXD2Zj

    5 votes
    1. [2]
      DataWraith

      That's incredibly cool!

      I've been reading many semantic segmentation papers lately while trying to improve an NN for localizing key fields on badly scanned documents. From what I read, DeepLab v3 is apparently considered a bit heavy/slow, so many lighter models have been devised for real-time use on drones or vehicles.

      As an aside: the semantic segmentation images in your YouTube video remind me of Stanley, Stanford's winning entry to the second DARPA Grand Challenge -- they used their LIDAR scanners to map out flat terrain in front of the vehicle and then extrapolated from that what the road looked like all the way to the horizon in real-time. It's amazing that that can be done from monocular images nowadays.

      3 votes
      1. tlalexander

        Thanks! Yeah, I figured this might be slow. I’m still learning things like how to properly label your own data and fine-tune a pre-trained network, so I was happy to get results even if I need something different for use on the robot.

        And I rely heavily on paperswithcode.com, so I’ll have to look around there for something faster once I’m ready for that. My plan in the near term is to get stereo and monocular depth estimation for the 4 cameras as well as visual odometry, and then work on higher-level applications. But I may simultaneously do some basic trail segmentation for trail following, so I will need to find a suitable real-time segmentation algorithm. 😊

  4. BuckeyeSundae

    I've been working on <redacted>.

    I joke. Kinda.

    What I can say is that my recent projects have me doing Data Science on pretty large databases. Usually focused exclusively on the security side of the house. Using the prior events that we track to inform the validation process for the future content we create has been most of my last two weeks. Kind of building it from the ground up, so it's been a wild ride.

    One of the coolest things I unlocked is so, uh, application-specific that it barely makes sense to relay. I figured out how to call the API of the application I'm working with to return the data that can inform my validation process. The cool thing is that it will inform every validation process I want to build out. The scary thing is that it could theoretically break the entire system the company uses for that data if I'm not careful, because I have to go after that data in the live environment, which is spooky as all hell.

    4 votes
  5. kjhanonichi

    none. my shit's all fucked up, and there's still a lot happening to me while i'm trying to deal with a lot. it makes me really sad some days because it feels like it's a part of myself that i'm losing. the last few things i did were just some bash scripts for stuff i needed or wanted, like a little radio player that used mplayer/jq to keep track of internet/pirate radios and play them. or like updating my dotfiles/gentoo-config repos

    4 votes